METHOD OF IDENTIFYING POLYPEPTIDE MONOBODIES WHICH
BIND TO TARGET PROTEINS AND USE THEREOF 



This application claims the benefit of U.S. Provisional Patent 
Application Serial No. 60/249,756, filed November 17, 2000, which is hereby 
incorporated by reference in its entirety. 

This invention was made, in part, with funding received from the 
National Institutes of Health grant number R29-GM55042 and the U.S. Army grant 
number DMAD 17-97-1-7295. The U.S. government may have certain rights in this
invention. 

FIELD OF THE INVENTION 

The present invention relates generally to polypeptide monobodies,
more particularly polypeptide monobodies derived from the tenth fibronectin type III
domain from human fibronectin ("FNfn10"), as well as methods of identifying such
monobodies having target protein binding activity, and the use thereof for modulating
target activity.

BACKGROUND OF THE INVENTION 

Many biological processes are regulated by proteins. Regulatory 
proteins undergo conformational changes to alter their interactions with partners 
and/or alter their catalytic efficiency. Thus, it is essential to detect conformational 
changes of proteins in order to understand the molecular mechanism underlying their 
functions. Although a large body of in vitro studies has revealed conformational 
changes of proteins, there are no established techniques to monitor protein
conformational changes in the cellular environment. Biophysical measurements, such 
as X-ray crystallography, nuclear magnetic resonance, and other spectroscopies, 
typically require purified samples and conditions that are drastically different from 
those inside the cells. It is generally accepted that the "molecular crowding" within the 
cellular environment can significantly affect ligand binding, catalysis, stability and 
folding of macromolecules (Minton, 2000). For example, the structures and the
relative populations of "active" and "inactive" conformations of a protein may be
quite different from those determined using in vitro biophysical methods. Therefore, it 
would be of great value to establish a strategy to probe conformations of proteins in 
living cells. 

An alternative approach to direct structure determination is the use of 
conformation-specific probes. Anfinsen and others used conformation-specific
antibodies to demonstrate reversible unfolding of ribonuclease in in vitro experiments 
(Sachs et al., 1972). Thus, it is conceivable that one can introduce conformation- 
specific probes, such as antibodies, inside cells and determine their respective binding 
affinity to a target to probe conformational changes of the target. To implement this 
strategy, one must first obtain conformation-specific probes and establish detection 
methods for probe binding. However, antibodies and their fragments usually require 
the formation of disulfide bonds for proper folding and, thus, they do not always 
function in the reducing environment inside cells. Also, no general methods are 
available to generate conformation-specific antibodies. Short peptides may also be 
used, but they tend to be rapidly degraded in cells due to their low resistance to 
proteolysis. 

Antibody-mimics, termed "monobodies", formed using a small β-sheet
protein scaffold such as the tenth fibronectin type III domain from human fibronectin
(FNfn10) have been previously described (Koide et al., 1998). It was shown that
monobodies with a novel binding function can be engineered by screening phage-
display libraries of FNfn10 in which loop regions are diversified. FNfn10 does not
contain disulfide bonds or metal binding sites, is highly stable and undergoes
reversible unfolding (Koide et al., 1998; Main et al., 1992; Plaxco et al., 1996). While
the stability of monobodies makes them well suited for intracellular studies, there has 
been no use of monobodies to probe conformations of proteins in living cells. 

A number of disease states are dependent upon nuclear receptor
activity and conformation. For example, human estrogen receptor α (ERα) normally
regulates the growth and differentiation of the female reproductive system and those
of skeletal, neural, and cardiovascular tissues in both males and females (Korach,
1994). Yet ERα is a therapeutic target of, and a clinical marker for, estrogen-
responsive breast tumor (Jordan et al., 1992). A diverse group of ligands, including
antiestrogens that are in clinical use, exists which modulates ER transcriptional
activation and the physiological response of the hormone 17β-estradiol (E2) (Anstead
et al., 1997). Because the conformation of ERα as it is involved in disease state is
unknown, it would be desirable to identify an approach to rapidly classify ERα
conformation as well as develop a preliminary screening tool for estrogen- and
antiestrogen-like molecules. Any approach which would function to classify ERα
conformation and screen estrogen- and antiestrogen-like molecules should also be
operable with other nuclear receptors: classifying their conformations and
screening their agonists and antagonists.

In addition to screening, another important feature in drug discovery is
target validation. The majority of target validation methods are based on nucleic acid
techniques. These include gene knockout (the gene coding for the protein of interest
is eliminated from the genome of the organism) and antisense DNA (DNA that
hybridizes to the messenger RNA of the protein of interest is produced in the cell to
inhibit the expression of the protein). These techniques are limited in that some genes
are essential for the growth of the organism and cannot be deleted, and the effect of
deleting a protein may be different from inhibiting its function (sometimes only
partially) with drugs.

Recently, however, a few methods based on protein technologies have 
been reported (Mhashilkar et al., 1995; Richardson et al., 1995; Colas et al., 1996; 
Cochet et al., 1998; Colas & Brent, 1998; Fabbrizio et al., 1999; Norris et al., 1999). 
Proteins or peptides that bind to the protein of interest ("peptide aptamers") are first 
isolated (typically using combinatorial library screening). Then the peptide aptamer is 
introduced into the organism of interest (typically using an expression vector), and the 
effect(s) of the aptamer is analyzed. For peptide aptamers, constrained peptides that 
are displayed on a protein (Colas et al., 1996; Fabbrizio et al., 1999), linear peptides 
(Norris et al., 1999), and antibody fragments (Mhashilkar et al., 1995) have been 
reported. Though these approaches have been at least in some sense successful, they 
have their limitations. The first two methods use only one contiguous segment of 
peptides for binding, and thus the binding interface achieved by these methods is 
limited. Antibody fragments (e.g., single-chain Fv and Fab) contain disulfide bonds,
and these disulfide bonds are important for the stability of antibody fragments. The
cytoplasm of the cell is generally a reducing environment, making it difficult to
maintain the active conformation of antibody fragments. Thus, antibody fragments 
expressed in the cytoplasm are not always functional (Cochet et al., 1998). 

The present invention overcomes these and other deficiencies in the art.

SUMMARY OF THE INVENTION 

A first aspect of the present invention relates to a fibronectin type III
(Fn3) polypeptide monobody including: at least two Fn3 β-strand domain sequences
with a loop region sequence linked between adjacent β-strand domain sequences; and
optionally, an N-terminal tail of at least about 2 amino acids, a C-terminal tail of at 
least about 2 amino acids, or both; wherein at least one loop region sequence, the N- 
terminal tail, or the C-terminal tail comprises an amino acid sequence which varies by 
deletion, insertion, or replacement of at least two amino acids from a corresponding 
loop region, N-terminal tail, or C-terminal tail in a wild-type Fn3 domain of 
fibronectin, and wherein the polypeptide monobody exhibits nuclear receptor binding 
activity. 

A second aspect of the present invention relates to a fusion protein 
which includes a first portion including a polypeptide monobody of the present 
invention and a second portion fused to the first portion. 

A third aspect of the present invention relates to a DNA molecule 
encoding a polypeptide monobody of the present invention, as well as expression 
vectors and host cells which contain such DNA molecules. 

A fourth aspect of the present invention relates to a combinatorial 
library including: a plurality of fusion polypeptides each including a transcriptional 
activation domain fused to a distinct fibronectin type III (Fn3) polypeptide monobody,
the polypeptide monobody including (i) at least two Fn3 β-strand domain sequences,
(ii) a loop region sequence linked between adjacent β-strand domain sequences, and
(iii) optionally, an N-terminal tail of at least about 2 amino acids, a C-terminal tail of
at least about 2 amino acids, or both, wherein at least one loop region sequence, the N-
terminal tail, or the C-terminal tail includes a combinatorial amino acid sequence
which varies by deletion, insertion, or replacement of at least two amino acids from a
corresponding loop region, N-terminal tail, or C-terminal tail in a wild-type Fn3 
domain of fibronectin. 

A fifth aspect of the present invention relates to an in vivo composition
including: a fusion polypeptide of the combinatorial library of the present invention; a
reporter gene under control of a 5' regulatory region; and a chimeric gene which
encodes a second fusion polypeptide including a target protein, or fragment thereof,
fused to the C-terminus of a DNA-binding domain which binds to the 5' regulatory
region of the reporter gene, wherein binding of the polypeptide monobody of the
fusion polypeptide to the target protein, or fragment thereof, of the second fusion
polypeptide brings the transcriptional activation domain of the fusion polypeptide in
sufficient proximity to the DNA-binding domain of the second fusion polypeptide to
induce expression of the reporter gene.

A sixth aspect of the present invention relates to a method of
identifying a polypeptide monobody having target protein binding activity, which
method includes: providing a host cell including (i) a reporter gene under control of a
5' regulatory region operable in the host cell, (ii) a first chimeric gene which encodes
a first fusion polypeptide including a target protein, or fragment thereof, fused to a C-
terminus of a DNA-binding domain which binds to the 5' regulatory region of the
reporter gene, and (iii) a second chimeric gene which encodes a second fusion
polypeptide including a polypeptide monobody fused to a transcriptional activation
domain; and detecting expression of the reporter gene, which indicates binding of the
polypeptide monobody of the second fusion polypeptide to the target protein such that
the transcriptional activation domain of the second fusion polypeptide is in sufficient
proximity to the DNA-binding domain of the first fusion polypeptide to allow
expression of the reporter gene.

A seventh aspect of the present invention relates to a method of
screening a candidate drug for nuclear receptor agonist or antagonist activity, which
method includes: providing a host cell including (i) a reporter gene under control of a
5' regulatory region, (ii) a first chimeric gene which encodes a first fusion polypeptide
including a nuclear receptor, or fragment thereof including a ligand-binding domain,
fused to a C-terminus of a DNA-binding domain which binds to the 5' regulatory
region of the reporter gene, and (iii) a second chimeric gene which encodes a second
fusion polypeptide including a polypeptide sequence fused to a transcriptional
activation domain, the polypeptide sequence binding to the nuclear receptor, or
fragment thereof, in the absence of both an agonist and an antagonist of the nuclear
receptor, presence of an agonist of the nuclear receptor, presence of an antagonist of
the nuclear receptor, or presence of both an agonist and an antagonist of the nuclear
receptor; growing the host cell in a growth medium comprising a candidate drug; and
detecting expression of the reporter gene, which indicates binding of the polypeptide
sequence of the second fusion polypeptide to the nuclear receptor, or fragment thereof,
such that the transcriptional activation domain of the second fusion polypeptide is in
sufficient proximity to the DNA-binding domain of the first fusion polypeptide to
allow expression of the reporter gene, wherein modulation of reporter gene expression
indicates that the candidate drug is either an agonist or an antagonist, or has mixed
activity.

An eighth aspect of the present invention relates to a kit including: a
culture system which includes a culture medium on which has been placed at least one
type of transformed host cell, each of the at least one type of transformed host cell
comprising (i) a reporter gene under control of a 5' regulatory region, (ii) a first
chimeric gene which encodes a first fusion polypeptide comprising a nuclear receptor,
or fragment thereof including a ligand-binding domain, fused to a C-terminus of a
DNA-binding domain which binds to the 5' regulatory region of the reporter gene, and
(iii) a second chimeric gene which encodes a second fusion polypeptide comprising a
polypeptide sequence fused to a transcriptional activation domain, the polypeptide
sequence binding to the nuclear receptor, or fragment thereof, in the absence of both
an agonist and an antagonist of the nuclear receptor, presence of an agonist of the
nuclear receptor, presence of an antagonist of the nuclear receptor, or presence of both
an agonist and an antagonist of the nuclear receptor.

A ninth aspect of the present invention relates to a kit including: a
plurality of host cells, each including a reporter gene under control of a 5' regulatory
region and a heterologous DNA molecule encoding a first fusion polypeptide
including a nuclear receptor, or fragment thereof which includes a ligand-binding
domain, fused to a C-terminus of a DNA-binding domain which binds to the 5'
regulatory region of the reporter gene; and a vector including a DNA molecule
encoding a second fusion polypeptide including a transcriptional activation domain
fused to a polypeptide monobody; wherein upon mutation of the DNA molecule to
encode a mutant polypeptide monobody and wherein upon introduction of the vector
into at least a portion of said plurality of host cells, expression of the reporter gene is
induced upon binding of the polypeptide monobody of the second fusion polypeptide
to the nuclear receptor, or fragment thereof, of the first fusion polypeptide such that
the transcriptional activation domain of the second fusion polypeptide is in sufficient
proximity to the DNA-binding domain of the first fusion polypeptide.

A tenth aspect of the present invention relates to a method of validating 
target protein activity which includes: exposing a target protein to a polypeptide 
monobody which binds to the target protein and determining whether binding of the 
target protein by the polypeptide monobody modifies target protein activity. 

An eleventh aspect of the present invention relates to a method of 
measuring polypeptide monobody binding affinity for a target protein, which method 
includes: exposing a target protein to an interaction partner which binds the target 
protein and a polypeptide monobody which binds the target protein; and measuring 
the degree to which the polypeptide monobody competes with the interaction partner. 

A twelfth aspect of the present invention relates to a method of 
modulating target protein activity which includes: exposing a target protein to a 
polypeptide monobody which binds the target protein under conditions effective to 
modify target protein activity. 

The two-hybrid system is particularly suitable for the purpose of
identifying polypeptide monobodies which have activity in binding a target protein
such as a nuclear receptor. In addition, the two-hybrid system can also be used during
validation of polypeptide monobody affinity for a target protein and for measuring its
ability to modulate activity of the target protein. By identifying polypeptides that can
detect conformational changes on target proteins such as nuclear receptors, the present
invention allows for drug screening to determine whether candidate drugs or
potentially toxic agents are likely to have the capability to modify nuclear receptor
activity, either as an agonist, an antagonist, or simply an inactive inhibitor of the
nuclear receptor. Thus, the polypeptide monobodies which bind to the different
conformations of the nuclear receptor can be used immediately in assays described
herein. Moreover, polypeptide monobodies which have activity in modifying nuclear
receptor activity can be used for therapeutic uses in the treatment of nuclear receptor-
related diseases or conditions.
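
By way of non-limiting illustration, the screening logic summarized above can be sketched in Python. The sketch assumes two hypothetical conformation-specific probes (one monobody that binds the agonist-bound receptor and one that binds the antagonist-bound receptor), a hypothetical background reporter level, and an arbitrary fold-change cutoff; none of these names or values is taken from the assays described herein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReporterReadout:
    """Reporter signal measured with each hypothetical probe (arbitrary units)."""
    agonist_probe: float     # monobody assumed to bind the agonist-bound receptor
    antagonist_probe: float  # monobody assumed to bind the antagonist-bound receptor


def classify_candidate(readout: ReporterReadout,
                       background: float,
                       fold_threshold: float = 3.0) -> str:
    """Label a candidate drug from the pattern of reporter induction.

    A probe is scored as "on" when its signal is at least
    fold_threshold times the background signal (an assumed cutoff).
    """
    cutoff = background * fold_threshold
    agonist_like = readout.agonist_probe >= cutoff
    antagonist_like = readout.antagonist_probe >= cutoff
    if agonist_like and antagonist_like:
        return "mixed"
    if agonist_like:
        return "agonist-like"
    if antagonist_like:
        return "antagonist-like"
    return "no detectable effect"
```

In this sketch a candidate that induces the reporter only with the agonist-dependent probe is labeled agonist-like, and so on; in practice the cutoff would need to be calibrated against known ligands.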


BRIEF DESCRIPTION OF THE DRAWINGS 



Figures 1A-B are schematic drawings of the structure of the tenth Fn3
domain of human fibronectin (FNfn10). β-Strands are labeled as A-G, and the loop
regions that are used for target binding in monobodies are also labeled.

Figure 2 illustrates a nucleotide sequence (SEQ ID No: 1) encoding the 
amino acid sequence (SEQ ID No: 2) of the wild-type FNfn10. The amino acid
numbering is according to Main et al. (1992). The BC loop region and the FG loop 
region are shown in boxes. 

Figures 3A-B illustrate the amino acid sequence of the wild-type
FNfn10 (SEQ ID No: 2, Figure 3A) as well as a mutant FNfn10 (SEQ ID No: 3,
Figure 3B) which has the Asp-7 residue replaced with a non-negatively charged amino
acid residue (X), which is preferably either Asn or Lys. As reported in Koide et al.
(2001), both of these mutations have the effect of promoting greater stability of the
mutant FNfn10 at neutral pH as compared to the wild-type FNfn10.

Figures 4A-B schematically illustrate a two-hybrid system. Two 
possibilities exist for interaction between the two fusion proteins: no interaction as 
shown in Figure 4A or interaction as shown in Figure 4B.

Figure 5 illustrates the nucleotide sequence (SEQ ID No: 4) for the
coding region of an exemplary prey fusion protein. The FNfn10-B42 fusion protein
(SEQ ID No: 5) was prepared in the library designated pFNB42B5F7. The nucleotide
sequence that was diversified in this library is shown in bold. The amino acid
sequence of the combinatorial FNfn10 (underlined, SEQ ID No: 6) is shown fused N-
terminal to the B42 activation domain. This is opposite to the orientation shown in
Figure 5, although either orientation can be utilized. N denotes a mixture of A, T, G,
and C; K denotes a mixture of G and T; and Xaa denotes any amino acid residue.

Figure 6 illustrates the nucleotide sequence (SEQ ID No: 7) for the
coding region of another exemplary prey fusion protein. The FNfn10-B42 fusion
protein (SEQ ID No: 8) was prepared in the library designated pYT45AB7N. The
nucleotide sequence region that was diversified in this library is shown in bold. This
library was constructed by inserting seven diversified residues between Pro15 and
Thr16 in the AB loop (residue numbering according to Koide et al., 1998). The amino
acid sequence of the combinatorial FNfn10 (underlined, SEQ ID No: 9) is shown
fused C-terminal to the B42 activation domain. N denotes a mixture of A, T, G, and
C; S denotes a mixture of G and C; and Xaa denotes any amino acid residue.

Figure 7 illustrates the nucleotide sequence (SEQ ID No: 10) for the
coding region of another exemplary prey fusion protein. The FNfn10-B42 fusion
protein (SEQ ID No: 11) was prepared in the library designated pYT45B3F7. The
nucleotide sequence region that was diversified in this library is shown in bold. The
amino acid sequence of the combinatorial FNfn10 (underlined, SEQ ID No: 12) is
shown fused C-terminal to the B42 activation domain. N denotes a mixture of A, T,
G, and C; K denotes a mixture of G and T; and Xaa denotes any amino acid residue.

Figure 8 illustrates the nucleotide sequence (SEQ ID No: 13) for the
coding region of another exemplary prey fusion protein. The FNfn10-B42 fusion
protein (SEQ ID No: 14) was prepared in the library designated pYT47F16. The
nucleotide sequence region that was diversified in this library is shown in bold. The
amino acid sequence of the combinatorial FNfn10 (underlined, SEQ ID No: 15) is
shown fused C-terminal to the B42 activation domain. N denotes a mixture of A, T,
G, and C; K denotes a mixture of G and T; and Xaa denotes any amino acid residue.

Figure 9 is a map of plasmid pYT45, which is derived from plasmid
pYESTrp2 (Invitrogen, CA) by the introduction of FNfn10 (Koide et al., 1998) so that
FNfn10 was fused C-terminal to the B42 activation domain. pYESTrp2 and, thus,
pYT45 include a T7 promoter sequence upstream of regions coding for (from 5' to
3') a V5 epitope, a nuclear localization signal, and the B42-FNfn10 fusion.

Figure 10 illustrates the nucleotide sequence (SEQ ID No: 16) of the
B42-FNfn10 fusion protein in the plasmid pYT45 shown in Figure 9. The amino acid
sequence (SEQ ID No: 17) for FNfn10 is underlined.

Figure 11 is a map of plasmid pEGERα295-595, which is derived from
pEG202 (OriGene). pEGERα295-595 includes the E and F domains (residues 295-
595) of estrogen receptor α. Insertion of the coding sequence for the EF domains
affords a lexA-ERαEF fusion construct.

Figures 12A-B illustrate the nucleotide sequence (SEQ ID No: 18) of
the LexA-ERα fusion protein in plasmid pEGERα295-595 illustrated in Figure 11.
The amino acid sequence (SEQ ID No: 19) for ERα domains E and F is underlined.

Figures 13A-D illustrate the structure of estrogen receptor α. Figure
13A illustrates schematically the nuclear receptor domain structure: AF-1, ligand-
independent activation function; DBD, DNA-binding domain; and AF-2, ligand-
dependent activation function. Figures 13B-D are schematic drawings of the crystal
structures of ERα-LBD illustrating ligand-induced conformational changes. Figures
13B-C are from Shiau et al. (1998); and Figure 13D is from Tanenbaum et al.
(1998). Helix 12 is highlighted in black. In Figure 13B, an LXXLL (SEQ ID No: 20)
peptide is bound to the coactivator-binding site, but the peptide is omitted in the figure
for clarity. In Figure 13D, an aberrant intermolecular disulfide bond forces Helix 12
to an extended conformation.

Figures 14A-H illustrate the in vivo binding specificity of ERα-binding
monobodies, as tested using quantitative β-galactosidase assays. In Figures 14A-G,
binding specificity toward agonist, antagonist, and selective estrogen receptor
modulators ("SERMs") is shown. In Figure 14H, Western blotting shows that the
amount of LexA-ERα-EF was similar in the presence of different ligands.
Abbreviations: ICI, ICI 182,780; RAL, raloxifene; PROG, progesterone; and EtOH, no
added ligand.

Figures 15A-D illustrate in vivo binding specificity of monobodies to
different ERα-EF/agonist complexes. Abbreviations: E3, estriol; DES,
diethylstilbestrol; GEN, genistein; EtOH, no added ligand.

Figures 16A-D show the effects of the F domain on the binding of
ERα to SRC-1 and monobodies. Quantitative β-galactosidase assays were performed
for yeast two-hybrid strains containing a monobody (or SRC-1)-activation domain
fusion and either the ERα-EF or E domain-DNA binding domain fusion proteins.
Experiments were performed in the same manner as in Figure 14. Figure 16E is a
Western blot of yeast cells containing LexA-ERα-EF (lanes 1 and 2) or LexA-ERα-E
(lanes 3 and 4) probed with an anti-LexA antibody (top) or anti-ERα-F domain
antibody (bottom). Yeast cells were grown in the presence (lanes 1 and 3) and
absence (lanes 2 and 4) of E2. Note that these proteins are expressed at a similar level
and lanes 1 and 2 do not contain degradation products similar to LexA-ERα-E (lanes
3 and 4). Abbreviations: ICI, ICI 182,780; RAL, raloxifene; PROG, progesterone; and
EtOH, no added ligand.

Figures 17A-D demonstrate the use of a monobody collection as a
chemical sensor. Yeast cells containing E2-, OHT-, and (E2 or OHT)-dependent
monobodies were strategically placed on 5x5 grids ("No selection"). These cells were
stamped on growth selection plates (-leu) containing E2, OHT, or no ligand. White
circles are yeast cells grown on a media plate.

Figures 18A-D illustrate the in vivo binding specificity of monobody
clones, pYT47AB7N-A1 and -B1, as tested using semi-quantitative β-galactosidase
assays. Binding specificity toward ER complexed with agonist, antagonist and
SERMs, respectively, is shown. The top two panels show results with ERα-EF,
while the bottom two show results with ERβ-EF. Abbreviations used in this figure
are: ICI, ICI 182,780; RAL, raloxifene; PROG, progesterone; EtOH, no added ligand.
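
By way of illustration only, the degenerate codon notation used in the descriptions of Figures 5-8 (N, a mixture of A, T, G, and C; K, a mixture of G and T; S, a mixture of G and C) can be expanded computationally. The following Python sketch assumes the standard genetic code; the function names `expand_codon` and `summarize` are hypothetical and are not part of the libraries described herein.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate symbols as defined in the figure descriptions.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "T": "T",
              "N": "ATGC", "K": "GT", "S": "GC"}


def expand_codon(pattern):
    """Return every concrete codon matched by a degenerate pattern such as 'NNK'."""
    return ["".join(bases) for bases in product(*(DEGENERATE[ch] for ch in pattern))]


def summarize(pattern):
    """Return (number of codons, sorted amino acids encoded, stop codons)."""
    codons = expand_codon(pattern)
    amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    return len(codons), amino_acids, stops
```

Under these assumptions, both NNK and NNS expand to 32 codons that together cover all 20 amino acids while admitting a single stop codon (TAG), which is consistent with Xaa denoting any amino acid residue at a diversified position.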






DETAILED DESCRIPTION OF THE INVENTION 



As used herein, "polypeptide monobody" is intended to mean a
polypeptide which includes a β-strand domain lacking in disulfide bonds and
containing a plurality of β-strands, two or more loop regions each connecting one
β-strand to another β-strand, and optionally an N-terminal tail, a C-terminal tail, or both,
wherein at least one of the two or more loop regions, the N-terminal tail, or the C-
terminal tail is characterized by activity in binding a target protein or molecule. More
specifically, such polypeptide monobodies of the present invention can include three
or more loop regions or, even more specifically, four or more loop regions. The size
of such polypeptide monobodies is preferably less than about 30 kDa, more preferably
less than about 20 kDa.

Scaffolds for formation of a polypeptide monobody should be highly
soluble and stable. A suitable scaffold is small enough for structural analysis, yet
large enough to accommodate multiple binding domains so as to achieve tight binding
and/or high specificity for its target. One class of polypeptide monobodies of the present
invention is characterized by specificity for binding to a nuclear receptor. One
subclass of polypeptide monobodies of the present invention is characterized by their
ability to bind to a nuclear receptor which has been previously bound by an agonist
thereof. Another subclass of polypeptide monobodies of the present invention is
characterized by their ability to bind to a nuclear receptor which has been previously
bound by an antagonist thereof. To achieve the specificity in binding to a nuclear
receptor (either with or without prior binding by an agonist or antagonist), the amino
acid sequence of the polypeptide monobody has been modified relative to the scaffold
used for its construction.

An exemplary scaffold for formation of a polypeptide monobody is the
fibronectin type III domain (Fn3). Fibronectin is a large protein which plays essential
roles in the formation of extracellular matrix and cell-cell interactions; it consists of
many repeats of three types (types I, II, and III) of small domains (Baron et al., 1991).
Fn3 itself is the paradigm of a large subfamily (Fn3 family or s-type Ig family) of the
immunoglobulin superfamily. The Fn3 family includes cell adhesion molecules, cell
surface hormone and cytokine receptors, chaperonins, and carbohydrate-binding
domains (for reviews, see Bork & Doolittle, 1992; Jones, 1993; Bork et al., 1994;
Campbell & Spitzfaden, 1994; Harpaz & Chothia, 1994).

Crystallographic studies have revealed that the structure of the DNA
binding domains of the transcription factor NF-κB is also closely related to the Fn3
fold (Ghosh et al., 1995; Müller et al., 1995). These proteins are all involved in
specific molecular recognition, and in most cases ligand-binding sites are formed by
surface loops, suggesting that the Fn3 scaffold is an excellent framework for building
specific binding proteins. The 3D structure of Fn3 has been determined by NMR
(Main et al., 1992) and by X-ray crystallography (Leahy et al., 1992; Dickinson et al.,
1994). The structure is best described as a β-sandwich similar to that of antibody VH
domain except that Fn3 has seven β-strands (Figures 1A-B) instead of nine. There are
three loops on each end of Fn3; the positions of the BC, DE, and FG loops
approximately correspond to those of CDR 1, 2 and 3 of the VH domain.

Fn3 is small (~94 residues, Figure 2), monomeric, soluble, and stable.
It is one of few members of IgSF that do not have disulfide bonds and, therefore, is
stable under reducing conditions. Fn3 has been expressed in E. coli (Aukhil et al.,
1993). In addition, 17 Fn3 domains are present just in human fibronectin, providing
important information on conserved residues which are often important for the
stability and folding (see Main et al., 1992; Dickinson et al., 1994). From sequence
analysis, large variations are seen in the BC and FG loops, suggesting that the loops
are not crucial to stability. NMR studies have revealed that the FG loop is highly
flexible; the flexibility has been implicated in the specific binding of the 10th Fn3 to
α5β1 integrin through the Arg-Gly-Asp (RGD) motif. In the crystal structure of
human growth hormone-receptor complex (de Vos et al., 1992), the second Fn3
domain of the receptor interacts with growth hormone via the FG and BC loops,
suggesting it is feasible to build a binding site using the two loops.

The tenth type III module of fibronectin has a fold similar to that of
immunoglobulin domains, with seven β strands forming two antiparallel β sheets,
which pack against each other (Figures 1A-B; Main et al., 1992). The structure of the
type III module includes seven β strands, which form a sandwich of two antiparallel
sheets, one containing three strands (ABE) and the other four strands (C'CFG)
(Williams et al., 1988). The triple-stranded β sheet contains residues Glu-9-Thr-14
(A), Ser-17-Asp-23 (B), and Thr-56-Ser-60 (E). The majority of the conserved
residues contribute to the hydrophobic core, with the invariant hydrophobic residues
Trp-22 and Tyr-68 lying toward the N-terminal and C-terminal ends of the core,
respectively. The β strands are much less flexible and appear to provide a rigid
framework upon which functional, flexible loops can be built. The topology is similar
to that of immunoglobulin C domains.

Preferred polypeptide monobodies of the present invention are
fibronectin type III (Fn3)-derived polypeptide monobodies. Fn3 monobodies include
at least two Fn3 β-strand domain sequences with a loop region sequence linked
between adjacent β-strand domain sequences and optionally, an N-terminal tail of at
least about 2 amino acids, a C-terminal tail of at least about 2 amino acids, or both.
The at least one loop region sequence, the N-terminal tail, or the C-terminal tail, or
combinations thereof include an amino acid sequence which has binding specificity
for a nuclear receptor. To render a loop region sequence, N-terminal tail, or
C-terminal tail capable of binding to a nuclear receptor, either the loop region sequence,
the N-terminal tail, the C-terminal tail, or a combination thereof varies by deletion,
insertion, or replacement of at least two amino acids from a corresponding loop
region, N-terminal tail, or C-terminal tail in a wild-type or mutant Fn3 scaffold.

One preferred wild-type Fn3 scaffold is the tenth Fn3 domain of human
fibronectin (FNfn10), which has an amino acid sequence according to SEQ ID No: 2
(Figure 3A). One preferred mutant Fn3 scaffold is the tenth Fn3 domain of human
fibronectin which has a modified Asp7, which is replaced by a non-negatively charged
amino acid residue (e.g., Asn or Lys) as shown in Figure 3B (SEQ ID No: 3). As
reported in Koide et al. (2001), both of these mutations have the effect of promoting
greater stability of the mutant FNfn10 at neutral pH as compared to the wild-type
FNfn10.

Both the mutant and wild-type FNfn10 are characterized by the same
structure, namely seven β-strand domain sequences (designated A through G) and six
loop regions (AB loop, BC loop, CD loop, DE loop, EF loop, and FG loop) which
connect the seven β-strand domain sequences. In SEQ ID Nos: 2 and 3, the AB loop
corresponds to residues 15-16, the BC loop corresponds to residues 22-30, the CD
loop corresponds to residues 39-45, the DE loop corresponds to residues 51-55, the
EF loop corresponds to residues 60-66, and the FG loop corresponds to residues
76-87. As shown in Figures 1A-B, the BC loop, DE loop, and FG loop are all located at
the same end of the polypeptide monobody.
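
The loop boundaries listed above can be captured in a small lookup table, which is convenient when deciding which residues a library design touches. The following is a minimal sketch in Python; the names FNFN10_LOOPS, loop_length, and loop_of are illustrative assumptions and are not part of the disclosure. Only the residue ranges come from the text.

```python
# Loop regions of FNfn10 as stated in the text (numbering of SEQ ID Nos: 2 and 3).
# The table and helper names are hypothetical, chosen for illustration only.
FNFN10_LOOPS = {
    "AB": (15, 16),
    "BC": (22, 30),
    "CD": (39, 45),
    "DE": (51, 55),
    "EF": (60, 66),
    "FG": (76, 87),
}


def loop_length(name: str) -> int:
    """Number of residues in a loop, counting both end points."""
    start, end = FNFN10_LOOPS[name]
    return end - start + 1


def loop_of(residue: int):
    """Return the name of the loop containing a residue, or None if it lies in no listed loop."""
    for name, (start, end) in FNFN10_LOOPS.items():
        if start <= residue <= end:
            return name
    return None


if __name__ == "__main__":
    for name, bounds in FNFN10_LOOPS.items():
        print(name, bounds, loop_length(name))
```

With this table, the BC, DE, and FG loops that sit at the same end of the monobody can be selected by name when enumerating candidate positions for diversification.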

The nuclear receptor which is bound by a polypeptide monobody of the
present invention can be a steroid receptor, a thyroid receptor, a retinoid receptor, a
vitamin D receptor, or an orphan nuclear receptor. The polypeptide monobody of the
present invention which binds to a nuclear receptor can be specific for the nuclear
receptor which has been bound by a particular agonist or class of agonists, specific for
the nuclear receptor which has been bound by a particular antagonist or class of
antagonists, or specific for the nuclear receptor which has been bound by neither an
agonist nor an antagonist. Alternatively, the polypeptide monobody can bind to the
nuclear receptor regardless of its conformation.

Exemplary steroid receptors include estrogen receptors (ER-α or
ER-β), androgen receptors, progestin receptors, glucocorticoid receptors, and
mineralocorticoid receptors. One class of preferred estrogen receptor-specific
polypeptide monobodies exhibits estrogen receptor binding activity in the presence of
an estrogen receptor agonist (e.g., estradiol, estriol, diethylstilbestrol, or genistein).
Another class of preferred estrogen receptor-specific polypeptide monobodies exhibits
estrogen receptor binding activity in the presence of an estrogen receptor antagonist
(e.g., hydroxytamoxifen, ICI182780, or raloxifene). Because of their tissue-specific
functions, chemicals such as hydroxytamoxifen and raloxifene are classified as
selective estrogen receptor modulators (SERMs) (Jordan, 1998).

The polypeptide monobodies of the present invention can be prepared
by recombinant techniques, thereby affording the deletion, insertion, or replacement of
at least two amino acids from a corresponding loop region, N-terminal tail, or
C-terminal tail in a wild-type or mutant Fn3 scaffold. Deletions can be a deletion of at
least two amino acid residues up to substantially all but one amino acid residue
appearing in a particular loop region or tail. Insertions can be an insertion of at least
two amino acid residues up to about 25 amino acid residues, preferably at least two up
to about 15 amino acid residues. Replacements can be replacements of at least two up
to substantially all amino acid residues appearing in a particular loop region or tail.
According to one embodiment of the polypeptide monobodies, such polypeptide
monobodies possess an amino acid sequence which is at least 50% homologous to a
β-strand domain of the FNfn10.

The deletions, insertions, and replacements (relative to wild-type or
previously known mutant) on Fn3 scaffolds can be achieved using recombinant
techniques beginning with a known nucleotide sequence. A synthetic gene for the
tenth Fn3 of human fibronectin (Figure 2) was designed which includes convenient
restriction sites for ease of mutagenesis and uses specific codons for high-level protein
expression (Gribskov et al., 1984). This gene is substantially identical to the gene
disclosed in co-pending U.S. Patent Application Serial No. 09/096,749 to Koide filed
June 12, 1998, which is hereby incorporated by reference in its entirety.

The gene was assembled as follows: first the gene sequence was
divided into five parts with boundaries at designed restriction sites (Figure 2); for each
part, a pair of oligonucleotides that code opposite strands and have complementary
overlaps of about 15 bases was synthesized; the two oligonucleotides were annealed
and single strand regions were filled in using the Klenow fragment of DNA
polymerase; the double-stranded oligonucleotide was cloned into the pET3a vector
(Novagen) using restriction enzyme sites at the termini of the fragment and its
sequence was confirmed by an Applied Biosystems DNA sequencer using the dideoxy
termination protocol provided by the manufacturer; and these steps were repeated for
each of the five parts to obtain the whole gene. Although this approach takes more
time to assemble a gene than the one-step polymerase chain reaction (PCR) method
(Sandhu et al., 1992), no mutations occurred in the gene. Mutations would likely have
been introduced by the low fidelity replication by Taq polymerase and would have
required time-consuming gene-editing. Recombinant DNA manipulations were
performed according to Molecular Cloning (Sambrook et al., 1989), unless otherwise
stated.
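
The oligonucleotide-pair step described above (two oligonucleotides coding opposite strands, overlapping by about 15 bases, then filled in) can be illustrated with a short Python sketch. The function names, the midpoint-based split, and the example sequence are assumptions made for illustration; the actual oligonucleotide boundaries used for the synthetic gene are those of Figure 2 and are not reproduced here.

```python
# Sketch of designing one overlapping oligonucleotide pair and simulating fill-in.
# All names and the example sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligo_pair(part: str, overlap: int = 15):
    """Split one gene part into a top-strand and a bottom-strand oligo sharing `overlap` bases."""
    if len(part) < overlap:
        raise ValueError("part is shorter than the requested overlap")
    mid = (len(part) - overlap) // 2
    top = part[: mid + overlap]              # 5' region of the part, top strand
    bottom = reverse_complement(part[mid:])  # 3' region of the part, written 5'->3' on the bottom strand
    return top, bottom


def fill_in(top: str, bottom: str, overlap: int = 15) -> str:
    """Simulate annealing and polymerase fill-in; returns the full top-strand sequence."""
    bottom_as_top = reverse_complement(bottom)
    if top[-overlap:] != bottom_as_top[:overlap]:
        raise ValueError("oligos do not anneal over the expected overlap")
    return top + bottom_as_top[overlap:]


if __name__ == "__main__":
    part = "ACGTTGCAAGCTTGGATCCATGCCGTAACGGTACCTTGACGATCG"  # arbitrary made-up sequence
    top, bottom = design_oligo_pair(part)
    print(fill_in(top, bottom) == part)
```

Repeating the same design for each of the five parts, and joining the parts at their restriction sites, mirrors the stepwise assembly described in the text.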

Mutations can be introduced into the Fn3 gene using cassette
mutagenesis, oligonucleotide site-directed mutagenesis techniques (Deng &
Nickoloff, 1992), or Kunkel mutagenesis (Kunkel et al., 1987).

Both cassette mutagenesis and site-directed mutagenesis can be used to
prepare specifically desired nucleotide coding sequences. Cassette mutagenesis can
be performed using the same protocol for gene construction described above and the
double-stranded DNA fragment coding a new sequence can be cloned into a suitable
expression vector. Many mutations can be made by combining a newly synthesized
strand (coding mutations) and an oligonucleotide used for the gene synthesis.
Regardless of the approach utilized to introduce mutations into the monobody
nucleotide sequence, sequencing can be performed to confirm that the designed
mutations (and no other mutations) were introduced by mutagenesis reactions.

In contrast, Kunkel mutagenesis can be utilized to randomly produce a
plurality of mutated monobody coding sequences which can be used to prepare a
combinatorial library of polypeptide monobodies for screening. Basically, targeted
loop regions (or C-terminal or N-terminal tail regions) can be randomized using the
NNK codon (N denoting a mixture of A, T, G, C, and K denoting a mixture of G and
T) (Kunkel et al., 1987).
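
The NNK scheme can be enumerated directly to see what it encodes. The Python sketch below lists the NNK codons, translates them with the standard genetic code, and draws a random loop; the function name random_nnk_loop and the use of a seeded random generator are illustrative assumptions rather than part of the described protocol.

```python
# Enumerate NNK codons (N = A/C/G/T, K = G/T) and translate them with the standard genetic code.
import random
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

NNK_CODONS = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
NNK_AMINO_ACIDS = sorted({CODON_TABLE[c] for c in NNK_CODONS} - {"*"})
NNK_STOP_CODONS = [c for c in NNK_CODONS if CODON_TABLE[c] == "*"]


def random_nnk_loop(n_residues: int, rng: random.Random):
    """Draw n_residues random NNK codons; return the DNA and its translation."""
    codons = [rng.choice(NNK_CODONS) for _ in range(n_residues)]
    dna = "".join(codons)
    protein = "".join(CODON_TABLE[c] for c in codons)
    return dna, protein


if __name__ == "__main__":
    print(len(NNK_CODONS), "codons,", len(NNK_AMINO_ACIDS), "amino acids, stops:", NNK_STOP_CODONS)
    print(random_nnk_loop(7, random.Random(0)))
```

Restricting the third position to G or T keeps all twenty amino acids available while leaving a single stop codon (TAG) among the 32 possible codons.
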

Regardless of the approach used to prepare the nucleic acid molecules
encoding the polypeptide monobody, the nucleic acid can be incorporated into host
cells using conventional recombinant DNA technology. Generally, this involves
inserting the DNA molecule into an expression system to which the DNA molecule is
heterologous (i.e., not normally present). The heterologous DNA molecule is inserted
into the expression system or vector in sense orientation and correct reading frame.
The vector contains the necessary elements (promoters, suppressors, operators,
transcription termination sequences, etc.) for the transcription and translation of the
inserted protein-coding sequences.

U.S. Patent No. 4,237,224 to Cohen and Boyer describes the
production of expression systems in the form of recombinant plasmids using
restriction enzyme cleavage and ligation with DNA ligase. These recombinant
plasmids are then introduced by means of transformation and replicated in unicellular
cultures including prokaryotic organisms and eukaryotic cells grown in tissue culture.

Recombinant molecules can be introduced into cells via
transformation, particularly transduction, conjugation, mobilization, or
electroporation. The DNA sequences are cloned into the vector using standard
cloning procedures in the art, as described by Sambrook et al. (1989).

A variety of host-vector systems may be utilized to express the
polypeptide monobody or fusion protein which includes a polypeptide monobody.
Primarily, the vector system must be compatible with the host cell used. Host-vector
systems include but are not limited to the following: bacteria transformed with
bacteriophage DNA, plasmid DNA, or cosmid DNA; microorganisms such as yeast
containing yeast vectors; and mammalian cell systems infected with virus (e.g.,
vaccinia virus, adenovirus, etc.). The expression elements of these vectors vary in
their strength and specificities. Depending upon the host-vector system utilized, any
one of a number of suitable transcription and translation elements can be used.

Different genetic signals and processing events control many levels of
gene expression (e.g., DNA transcription and messenger RNA (mRNA) translation).

Transcription of DNA is dependent upon the presence of a promoter
which is a DNA sequence that directs the binding of RNA polymerase and thereby
promotes mRNA synthesis. The DNA sequences of eukaryotic promoters differ from
those of prokaryotic promoters. Furthermore, eukaryotic promoters and
accompanying genetic signals may not be recognized in or may not function in a
prokaryotic system and, further, prokaryotic promoters may not be recognized in or
may not function in eukaryotic cells.

Similarly, translation of mRNA in prokaryotes depends upon the
presence of the proper prokaryotic signals which differ from those of eukaryotes.
Efficient translation of mRNA in prokaryotes requires a ribosome binding site called
the Shine-Dalgarno ("SD") sequence on the mRNA. This sequence is a short
nucleotide sequence of mRNA that is located before the start codon, usually AUG,
which encodes the amino-terminal methionine of the protein. The SD sequences are
complementary to the 3'-end of the 16S rRNA (ribosomal RNA) and probably
promote binding of mRNA to ribosomes by duplexing with the rRNA to allow correct
positioning of the ribosome. For a review on maximizing gene expression, see
Roberts & Lauer (1979).
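
As an illustration of the relationship between the SD sequence and the start codon, the Python sketch below scans an mRNA for AUG codons preceded by an SD-like motif. The motif AGGAGG, the spacing window of 4 to 12 nucleotides, and the function name find_sd_starts are assumptions chosen for illustration; the text above states only that the SD sequence lies before the start codon.

```python
# Locate AUG start codons that are preceded by a Shine-Dalgarno-like motif.
# The motif and spacing defaults are illustrative assumptions, not values from the text.
import re


def find_sd_starts(mrna: str, sd: str = "AGGAGG", min_gap: int = 4, max_gap: int = 12):
    """Return (sd_index, aug_index) pairs where `sd` ends min_gap..max_gap nt before an AUG."""
    mrna = mrna.upper().replace("T", "U")  # accept DNA or RNA spelling
    hits = []
    for match in re.finditer("(?=AUG)", mrna):
        aug = match.start()
        for gap in range(min_gap, max_gap + 1):
            sd_start = aug - gap - len(sd)
            if sd_start >= 0 and mrna[sd_start:sd_start + len(sd)] == sd:
                hits.append((sd_start, aug))
    return hits


if __name__ == "__main__":
    example = "UUAACUAGGAGGAACAGCUAUGAAAGCU"  # made-up transcript fragment
    print(find_sd_starts(example))
```

A scan of this kind is only a rough check on a prokaryotic expression construct; it does not predict translation efficiency.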

Once the DNA molecule encoding the polypeptide monobody has been
cloned into an expression system, it is ready to be incorporated into a host cell. Such
incorporation can be carried out by the various forms of transformation noted above,
depending upon the vector/host cell system. Suitable host cells include, but are not
limited to, bacteria, yeast cells, mammalian cells, etc.

Polypeptide monobodies of the present invention are particularly well
suited for expression as fusion proteins in combinatorial libraries to be screened, i.e.,
using a yeast or mammalian two-hybrid system. Thus, another aspect of the present
invention relates to a combinatorial library which includes a plurality of fusion
polypeptides. Each of the fusion polypeptides within the combinatorial library
includes a transcriptional activation domain fused to a fibronectin type III (Fn3)
polypeptide monobody as described above, with at least one loop region sequence, the
N-terminal tail, or the C-terminal tail including a combinatorial amino acid sequence
which varies by deletion, insertion, or replacement of at least two amino acids from a
corresponding loop region, N-terminal tail, or C-terminal tail in a wild-type Fn3
domain of fibronectin.

The size of the combinatorial library will necessarily vary depending
on the size of the combinatorial sequence introduced into the monobody coding
sequence (i.e., the number of mutations introduced into a particular loop or tail coding
sequence). For purposes of screening, however, the combinatorial library is preferably
at least about 10^3 in size, affording at least about 10^5 transformed cells. Therefore,
while some redundancy may exist for each individual combinatorial amino acid
sequence, considering the total number of transformants, the combinatorial sequence
in each individual transformant differs from substantially all other combinatorial
sequences present in the combinatorial array of transformants.
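
The relationship between library size and the number of transformants can be made concrete with a short calculation. The Python sketch below computes the theoretical diversity of a stretch of NNK-randomized positions and a Poisson estimate of how much of a library is represented among a given number of transformants. The uniform-sampling assumption and the function names are illustrative and are not taken from the text.

```python
# Theoretical NNK diversity and a Poisson estimate of library coverage.
# Assumes every library member is equally likely to be picked up by a transformant.
import math


def nnk_diversity(n_positions: int):
    """Return (DNA-level, protein-level) diversity for n NNK-randomized positions."""
    return 32 ** n_positions, 20 ** n_positions


def expected_fraction_sampled(library_size: float, transformants: float) -> float:
    """Expected fraction of distinct library members present at least once."""
    return 1.0 - math.exp(-transformants / library_size)


if __name__ == "__main__":
    print(nnk_diversity(5))
    print(expected_fraction_sampled(1e3, 1e5))
    print(expected_fraction_sampled(1e5, 1e5))
```

Under these assumptions, a library of about 10^3 members is essentially fully represented among 10^5 transformants, whereas a fully randomized loop of five or more residues has far more possible sequences than transformants, so nearly every transformant carries a different combinatorial sequence.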

The combinatorial sequence in each polypeptide monobody can be the
result of deletions, insertions, or replacements of the type described above. In certain
aspects of the present invention, the combinatorial amino acid sequence is at least
about 5 amino acids in length, including one or more deletions, insertions, or
replacements. In other aspects of the present invention, the combinatorial amino acid
sequence is at least about 10 amino acids in length, including one or more deletions,
insertions, or replacements.

Yeast and mammalian two-hybrid systems have been established as
standard methods to identify and characterize protein interactions in the nucleus of
yeast cells (Fields & Song, 1989; Uetz & Hughes, 2000). These approaches have
previously been adapted for combinatorial library screening of specific peptide
libraries (Colas & Brent, 1998; Mendelsohn & Brent, 1994).

One version of the yeast two-hybrid system has been described (Chien
et al., 1991) and is commercially available from Clontech (Palo Alto, Calif.).

Briefly, utilizing such a system, plasmids are constructed that encode
two fusion proteins, the interaction of which is shown schematically in Figures 4A-B.
The first fusion protein (also known as "bait") contains the DNA-binding domain
(e.g., LexA) fused to a known protein, in this case a nuclear receptor or fragment
thereof which includes a functional ligand binding domain (NR-LBD). Any of the
above-identified nuclear receptors (or fragments thereof which include a functional
ligand binding domain) can be used as the bait protein or polypeptide. The second
fusion protein (also known as "prey") includes an activation domain (e.g., B42) fused
to an unknown protein, in this case a polypeptide monobody, that is encoded by a
cDNA which has been recombined into a plasmid as part of a combinatorial cDNA
library. Both plasmids include a promoter which is operable in yeast cells and which
has been ligated upstream of the fusion protein coding regions. The plasmids are
subsequently transformed into a strain of the yeast Saccharomyces cerevisiae that
contains a reporter gene (e.g., LEU2, lacZ, GFP, etc.) whose expression is regulated
by the transcription factor's binding site. Neither fusion protein alone can activate
transcription of the reporter gene. The DNA-binding domain fusion protein cannot
activate transcription, because it does not provide the activation domain function. The
activation domain fusion protein cannot activate transcription, because it lacks the
domain required for binding to its target site (e.g., it cannot localize to the
transcription activator protein's binding site). If the monobody of the prey is not
capable of binding to the nuclear receptor ligand binding domain of the bait (Figure
4A), then no reporter gene product is observed. For example, there is no growth of
the host yeast observed on (-)leu media and no β-galactosidase activity can be
observed. In contrast, where interaction between the monobody of the prey and the
nuclear receptor ligand binding domain of the bait occurs (Figure 4B), a functional
transcription factor is reconstituted, resulting in expression of the reporter gene which
can be detected by an assay for the reporter gene product. For example, there is
growth of the host yeast on (-)leu/(+)galactose media and β-galactosidase activity can
be observed.
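
The logic of the two-hybrid readout described above reduces to a simple rule: the reporter is expressed only when both fusion proteins are present and their fused partners bind one another. The Python sketch below is a toy model of that rule; the function name, the use of None for a missing fusion protein, and the example binder set are hypothetical.

```python
# Toy model of two-hybrid reporter logic; not a simulation of yeast biology.
from typing import Optional, Set, Tuple


def reporter_expressed(
    bait: Optional[str],
    prey: Optional[str],
    binders: Set[Tuple[str, str]],
) -> bool:
    """bait: protein fused to the DNA-binding domain (e.g., LexA), or None if absent.
    prey: protein fused to the activation domain (e.g., B42), or None if absent.
    binders: set of (bait, prey) pairs assumed to interact."""
    if bait is None or prey is None:
        return False  # neither fusion protein alone can activate transcription
    return (bait, prey) in binders


if __name__ == "__main__":
    known = {("NR-LBD", "monobody-1")}  # hypothetical interacting pair
    print(reporter_expressed("NR-LBD", "monobody-1", known))
    print(reporter_expressed("NR-LBD", "monobody-2", known))
    print(reporter_expressed(None, "monobody-1", known))
```

In a screen, the set of interacting pairs is of course unknown; the reporter readout is what reveals which prey monobodies belong to it.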

Thus, the two-hybrid system or related methodology can be used to 
screen activation domain libraries for polypeptide monobodies that interact with a 
known "bait" protein or polypeptide. 

A number of suitable techniques can be utilized to prepare DNA
molecules encoding the "bait" and "prey" fusion proteins. Basically, coding
sequences for the DNA binding domain and the nuclear receptor (or fragments thereof
which include a functional receptor binding domain) or the activation domain and
polypeptide monobody are ligated together to afford a single DNA molecule encoding
a translationally fused "bait" or "prey", respectively. This can be carried out prior to
insertion of the particular fusion protein coding sequence into an expression vector
(containing the appropriate regulatory sequences) or simultaneously therewith.

Suitable yeast two-hybrid vectors can be derived from any number of
known vectors. Exemplary bait plasmids include pEG202, pGilda, and pNLexA
(OriGene), and pHybLex/Zeo (Invitrogen). Exemplary prey plasmids include
pYESTrp, pYESTrp2 (Invitrogen), and pJG4-5 (OriGene). Suitable yeast-expressible
promoters for driving expression of the fusion constructs, and the selection genes, if
applicable, on the bait and prey library vectors, include but are not limited to, GAL1,
ADH, and CUP.

As noted above, a cDNA library encoding polypeptide monobodies can
be made using methods routinely practiced in the art. Accordingly, the library is
generated by inserting those cDNA fragments (encoding the monobodies) into a
vector such that they are translationally fused to the activation domain of B42 or Gal4.
This library can be co-transformed along with the bait gene fusion plasmid into a yeast
strain which contains, e.g., a lacZ gene, a nutrient marker gene, or a green fluorescent
protein gene, whose expression is controlled by a promoter which contains a lexA or
Gal4 activation sequence.

Figures 5-8 illustrate the coding sequence of different prey fusion
protein constructs prepared in accordance with the present invention. The
FNfn10-B42 fusion protein shown in Figure 5 (SEQ ID No: 5) was prepared in the library
designated pFNB42B5F7 (see Example 1 infra). This library was constructed by
randomizing residues 26-30 in the BC loop and randomizing residues 78-84 in the FG
loop (residue numbering according to Koide et al., 1998). The FNfn10-B42 fusion
protein shown in Figure 6 (SEQ ID No: 8) was prepared in the library designated
pYT45AB7N (see Example 1 infra). This library was constructed by inserting seven
diversified residues between Pro-15 and Thr-16 in the AB loop (residue numbering
according to Koide et al., 1998). The FNfn10-B42 fusion protein shown in Figure 7
(SEQ ID No: 11) was prepared in the library designated pYT45B3F7 (see Example 1
infra). This library was constructed by randomizing residues 26-30 in the BC loop
and randomizing residues 78-84 in the FG loop (residue numbering according to
Koide et al., 1998). The FNfn10-B42 fusion protein shown in Figure 8 (SEQ ID No:
14) was prepared in the library designated pYT47F16 (see Example 1 infra). This
library was constructed by randomizing residues 78-85 and inserting an additional
eight randomized residues in the FG loop (residue numbering according to Koide et
al., 1998).
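
The four library designs described above can be summarized as data, which makes it easy to compare how many positions each one diversifies. The Python sketch below records only what the preceding paragraph states; the dictionary layout and the function name diversified_positions are illustrative assumptions.

```python
# Library designs as described in the text (residue numbering per Koide et al., 1998).
# "randomized" maps a loop to an inclusive residue range replaced by random residues;
# "inserted" maps a loop to the number of additional diversified residues inserted.
LIBRARIES = {
    "pFNB42B5F7": {"randomized": {"BC": (26, 30), "FG": (78, 84)}, "inserted": {}},
    "pYT45AB7N": {"randomized": {}, "inserted": {"AB": 7}},
    "pYT45B3F7": {"randomized": {"BC": (26, 30), "FG": (78, 84)}, "inserted": {}},
    "pYT47F16": {"randomized": {"FG": (78, 85)}, "inserted": {"FG": 8}},
}


def diversified_positions(library: str) -> int:
    """Total number of randomized plus inserted residues in a library design."""
    design = LIBRARIES[library]
    replaced = sum(end - start + 1 for start, end in design["randomized"].values())
    return replaced + sum(design["inserted"].values())


if __name__ == "__main__":
    for name in LIBRARIES:
        print(name, diversified_positions(name))
```

The counts give a quick sense of the sequence space each design opens up, which in every case is far larger than the number of transformants that can be screened.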

Following co-transformation, the resulting transformants are screened
for those that express the reporter gene. If a particular polypeptide monobody
contains a polypeptide sequence which has activity binding to the nuclear receptor
ligand binding domain, then the two fusion proteins will be brought together by the
monobody binding to the nuclear receptor ligand binding domain. As a consequence,
the B42 or Gal4 activation sequence is brought into sufficient proximity to the LexA
or Gal4 binding domain, such that an active transcription factor is formed, thereby
driving expression of the reporter gene (e.g., lacZ, nutrient marker, GFP, etc.). Yeast
colonies which express lacZ can be detected by their blue color in the presence of
X-gal, whereas yeast colonies expressing a nutrient marker can be identified by survival
on nutrient selection media, and yeast colonies expressing a GFP can be detected by
their fluorescence following exposure to an excitatory light source (e.g., of suitable
wavelength). cDNA from colonies expressing the reporter proteins can then be purified
and used to produce and isolate the bait gene product interacting protein using techniques
routinely practiced in the art.

Colonies expressing the reporter gene can be purified and the (library)
plasmids responsible for reporter gene expression can be isolated. The inserts in the
plasmids can also be sequenced to identify the proteins encoded by the cDNA or
genomic DNA.

In addition, Finley et al. (1994) and Bendixen et al. (1994) have
described two-hybrid systems including a step of mating yeast cell colonies by
replica-plating diploids, that is to say by mating colonies of yeast cells.

U.S. Patent No. 6,114,111 to Luo et al. describes one example of a
mammalian two-hybrid system. Basically, this system includes the same components
as described for the yeast two-hybrid system, except the various vectors used for
transformation of mammalian host cells include viral origin of replication components
that require the presence of a viral replication protein to effect replication. The
reporter vector used in the mammalian two-hybrid system includes both a reporter
gene and a viral replication protein. Upon binding of the two fusion proteins ("prey"
and "bait"), the operator controlling expression of the reporter protein and viral
replication protein is activated, affording increased transcription of the reporter gene
and the viral replication protein gene. The viral replication protein can then bind to
the viral origin of replication on the bait and test vectors to permit replication of the
vector, ensuring survival of the cell due to the selection gene. The reporter gene then
serves as the basis of a sorting or screening system to isolate cells which have a
protein-protein interaction, and the test protein may be identified and characterized as
desired.

Suitable mammalian two-hybrid vectors can be derived from any
number of known vectors, including but not limited to, pCEP4 (Invitrogen), pCI-NEO
(Promega), and pBI-EGFP (Clontech). Suitable promoters for driving expression of
the fusion constructs, and the selection genes, if applicable, on the bait and test
vectors, include but are not limited to, CMV promoters, SV40, SR-α (Takebe et al.,
1988), respiratory syncytial virus promoters, thymidine kinase promoter, β-globin
promoter, etc.

Based on the in vivo selection of combinatorial libraries containing
polypeptide monobodies, via yeast or mammalian two-hybrid protocols, a further
aspect of the present invention relates to an in vivo composition which includes: a
combinatorial library of the present invention; a reporter gene under control of a 5'
regulatory region; and a chimeric gene which encodes a second fusion polypeptide
comprising a target protein, or fragment thereof, fused to the C-terminus of a
DNA-binding domain which binds to the 5' regulatory region of the reporter gene. Upon
binding of the polypeptide monobody of the fusion polypeptide to the target protein,
or fragment thereof, of the second fusion polypeptide, the transcriptional activation
domain of the fusion polypeptide is brought into sufficient proximity to the
DNA-binding domain of the second fusion polypeptide to induce expression of the reporter
gene.

The two-hybrid system is not limited to nuclear receptors. Virtually
any target protein that does not self-activate the reporter gene can be used. The
two-hybrid system is not suitable for membrane-bound targets. For such targets, the split
ubiquitin (Johnsson & Varshavsky, 1994) or dihydrofolate reductase reconstitution
can be used (Pelletier et al., 1998).

A further aspect of the present invention relates to a method of
identifying a polypeptide monobody having target protein binding activity. This
method is carried out by providing a host cell which includes (i) a reporter gene under
control of a 5' regulatory region operable in the host cell, (ii) a first chimeric gene
which encodes a first fusion polypeptide including a target protein, or fragment
thereof, fused to a C-terminus of a DNA-binding domain which binds to the 5'
regulatory region of the reporter gene, and (iii) a second chimeric gene which encodes
a second fusion polypeptide comprising a polypeptide monobody fused to a
transcriptional activation domain; and detecting expression of the reporter gene.
Reporter gene expression indicates binding of the polypeptide monobody of the
second fusion polypeptide to the target protein (such that the transcriptional activation
domain of the second fusion polypeptide is in sufficient proximity to the
DNA-binding domain of the first fusion polypeptide to allow expression of the reporter
gene).

The target protein can be any protein or polypeptide. A preferred target
protein is a nuclear receptor of the type described above.

The polypeptide monobody can be any polypeptide monobody as
described above, but preferably one which is derived from the tenth Fn3 domain of
human fibronectin, as described above.

Providing the host cell which expresses the reporter gene and the first
and second chimeric genes can be achieved through recombinant techniques known in
the art or otherwise described above. Basically, this includes transforming host cells
and/or mating recombinant host cells to achieve the recited host cell. For example, a
cell expressing the reporter gene can be transformed upon introduction of first and
second vectors (e.g., plasmids) which contain, respectively, the first and second
chimeric genes. The host cell can be either a yeast cell or a mammalian cell.

The method of carrying out detection of the reporter protein depends
on the type of reporter protein which is expressed. For example, with the lacZ
reporter, detection can be carried out by exposing host cells to X-gal and identifying
host cell colonies exhibiting β-galactosidase activity (presence of blue color); with a
nutrient marker, detection can be carried out by exposing host cells to a
nutrient-deficient medium and identifying yeast colonies that grow on the nutrient-deficient
medium; or with GFP reporters, detection can be carried out by exposing the host cells
to an excitatory light source (of appropriate wavelength) and identifying host cells that
emit light at a particular wavelength (i.e., which is particular for a given GFP).

In addition, this aspect of the present invention also contemplates
recovering the second chimeric gene from host cells exhibiting reporter protein
expression (identified as described above), modifying the amino acid sequence of the
encoded polypeptide monobody, and then repeating the steps of providing and
detecting (as described above) under more stringent conditions using a modified
second chimeric gene (which encodes the modified polypeptide monobody). The
purpose of this procedure is to identify polypeptide monobodies which have greater
affinity (lower dissociation constant) for the target protein. In modifying the second
chimeric gene, mutations can be introduced into the polypeptide monobody coding
sequence to modify any of the loop regions, either in addition to a loop region which
was originally modified or into a different loop region. For polypeptide monobodies
derived from the tenth Fn3 domain of human fibronectin, mutations can be introduced
into one or more of the plurality of loop sequences, the N-terminal tail, or the
C-terminal tail.

According to another aspect of the present invention, the two-hybrid
system can be used to screen candidate drugs for agonist or antagonist activity against
nuclear receptors. This method is carried out by first providing a host cell including
(i) a reporter gene under control of a 5' regulatory region, (ii) a first chimeric gene
which encodes a first fusion polypeptide including a nuclear receptor, or fragment
thereof including a ligand-binding domain, fused to a C-terminus of a DNA-binding
domain which binds to the 5' regulatory region of the reporter gene, and (iii) a second
chimeric gene which encodes a second fusion polypeptide including a polypeptide
sequence fused to a transcriptional activation domain. The polypeptide sequence can
bind to the nuclear receptor, or fragment thereof, either in the absence of both an
agonist and an antagonist of the nuclear receptor, in the presence of an agonist of the
nuclear receptor, in the presence of an antagonist of the nuclear receptor, or in the
presence of both an agonist and an antagonist of the nuclear receptor. The host cell is
grown in a growth medium which includes the candidate drug and expression of the
reporter gene is detected. Reporter gene expression indicates binding of the
polypeptide sequence of the second fusion polypeptide to the nuclear receptor, or
fragment thereof, such that the transcriptional activation domain of the second fusion
polypeptide is in sufficient proximity to the DNA-binding domain of the first fusion
polypeptide to allow expression of the reporter gene. Depending upon the nature of
the polypeptide sequence and its binding activity in the presence or absence of
agonists or antagonists of the nuclear receptor, modulation of reporter gene expression
can indicate whether the candidate drug is an agonist or an antagonist of the nuclear
receptor, or whether the candidate drug has mixed activity.

For example, polypeptide sequences which bind the nuclear receptor
only in the presence of nuclear receptor agonists will be capable of indicating that the
candidate drug has nuclear receptor agonist activity, whereas polypeptide sequences
which bind the nuclear receptor only in the presence of nuclear receptor antagonists
will be capable of indicating that the candidate drug has nuclear receptor antagonist
activity. Similarly, polypeptide sequences which bind the nuclear receptor only in the
presence of both nuclear receptor agonists and nuclear receptor antagonists will be
capable of indicating that the candidate drug has mixed activity. Finally, polypeptide
sequences which bind the nuclear receptor only in the absence of both nuclear receptor
agonists and nuclear receptor antagonists will be capable of confirming that a
candidate drug has no nuclear receptor binding activity.
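
The interpretation rules in the preceding paragraph can be written as a small lookup. The Python sketch below maps each type of conformation-specific sensor to the conclusion the text associates with it; the sensor labels, the function name interpret, and the boolean readout format are hypothetical, and real data would also require appropriate controls.

```python
# Map reporter readouts from conformation-specific sensor cells to the conclusions
# described in the text. Labels and data format are illustrative assumptions.
SENSOR_MEANING = {
    "agonist_bound": "agonist activity",
    "antagonist_bound": "antagonist activity",
    "agonist_and_antagonist_bound": "mixed activity",
    "unliganded": "no nuclear receptor binding activity",
}


def interpret(readouts: dict) -> list:
    """readouts maps a sensor type to True if reporter expression was observed
    when the host cell was grown with the candidate drug."""
    unknown = set(readouts) - set(SENSOR_MEANING)
    if unknown:
        raise ValueError(f"unknown sensor types: {sorted(unknown)}")
    return [meaning for sensor, meaning in SENSOR_MEANING.items() if readouts.get(sensor, False)]


if __name__ == "__main__":
    print(interpret({"agonist_bound": True, "antagonist_bound": False}))
    print(interpret({"unliganded": True}))
```

Running a candidate drug against all four sensor types, rather than one, is what allows its activity to be assigned to a single category.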

The polypeptide sequence which is used to perform the candidate drug
screening can be any polypeptide sequence which has nuclear receptor binding activity
under the various conditions. Preferably, candidate drugs are screened in up to four
different types of host cells, each of the four types expressing a different second fusion
polypeptide which includes a polypeptide sequence specific for binding under one of the
four recited conditions (i.e., presence of nuclear receptor agonist, presence of nuclear
receptor antagonist, absence of both nuclear receptor agonist and antagonist, and
presence of both nuclear receptor agonist and antagonist). Thus, candidate drugs can
be screened in each of the environments which can define the nature of their nuclear
receptor binding activity.
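
The four-sensor readout described above can be summarized in a short sketch. It is an illustration only: the `Sensor` names, the `interpret_screen` function, and the treatment of reporter expression as a simple on/off value are assumptions made for this example and are not part of the disclosed assay.

```python
from enum import Enum


class Sensor(Enum):
    """Condition under which a sensor cell's polypeptide binds the receptor.

    Each value is the conclusion the text associates with reporter
    expression in that sensor cell type (names are hypothetical).
    """
    AGONIST = "agonist activity"
    ANTAGONIST = "antagonist activity"
    AGONIST_AND_ANTAGONIST = "mixed activity"
    NO_LIGAND = "no receptor binding activity"


def interpret_screen(reporter_on):
    """Map reporter readouts (Sensor -> bool) to the activities they indicate."""
    calls = [sensor.value for sensor in Sensor if reporter_on.get(sensor, False)]
    # No reporter signal in any sensor cell type gives no basis for a call.
    return calls or ["inconclusive"]


readouts = {
    Sensor.AGONIST: True,
    Sensor.ANTAGONIST: False,
    Sensor.AGONIST_AND_ANTAGONIST: False,
    Sensor.NO_LIGAND: False,
}
print(interpret_screen(readouts))
```

In practice reporter expression is a graded signal, so an on/off call would require a threshold chosen against suitable controls.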

According to another embodiment for screening candidate drugs for
nuclear receptor binding, the polypeptide sequence of the second fusion polypeptide is
a polypeptide monobody. The polypeptide monobody can be any monobody as
described herein, but preferably a polypeptide monobody derived from the tenth Fn3
domain of human fibronectin.

As used above, candidate drugs can also refer to potentially toxic
agents. Regardless of whether the candidate drug is a potentially therapeutic agent or
one which can cause or contribute to development of a disease state (i.e., an endocrine
disrupter), the same assay can be performed to determine whether the drug or agent
being screened binds to a particular nuclear receptor and causes the nuclear receptor to
adopt a particular conformation.

As described above, the transformed host cells expressing a two-hybrid
system can be used as sensors for detecting conformationally-dependent nuclear
receptor binding activity of candidate drugs. Therefore, a related aspect of the present
invention relates to a kit for practicing this method of the invention. The kit includes:
a culture system which includes a culture medium on which has been (or can be)
placed at least one transformed host cell, each of the at least one transformed host cell
including (i) a reporter gene under control of a 5' regulatory region, (ii) a first
chimeric gene which encodes a first fusion polypeptide comprising a nuclear receptor,
or fragment thereof including a ligand-binding domain, fused to a C-terminus of a
DNA-binding domain which binds to the 5' regulatory region of the reporter gene, and
(iii) a second chimeric gene which encodes a second fusion polypeptide including a
polypeptide sequence fused to a transcriptional activation domain. The polypeptide
sequence can bind to the nuclear receptor, or fragment thereof, either in the absence of
both an agonist and an antagonist of the nuclear receptor, in the presence of an agonist
of the nuclear receptor, in the presence of an antagonist of the nuclear receptor, or in
the presence of both an agonist and an antagonist of the nuclear receptor.

Another kit of the present invention gives a user the flexibility to
mutate the polypeptide monobody as desired prior to transformation of host cells in a
two-hybrid system. This kit of the present invention includes: a plurality of host cells,
each including a reporter gene under control of a 5' regulatory region and a
heterologous DNA molecule encoding a first fusion polypeptide including a nuclear
receptor, or fragment thereof which includes a ligand-binding domain, fused to a
C-terminus of a DNA-binding domain which binds to the 5' regulatory region of the
reporter gene; and a vector including a DNA molecule encoding a second fusion
polypeptide including a transcriptional activation domain fused to a polypeptide
monobody. The vector including the DNA molecule encoding the second fusion
polypeptide can be present in a host cell. Upon mutation of the DNA molecule to
encode a mutant polypeptide monobody and introduction of the vector into at least a
portion of the plurality of host cells, expression of the reporter gene is induced upon
binding of the polypeptide monobody of the second fusion polypeptide to the nuclear
receptor, or fragment thereof, of the first fusion polypeptide such that the
transcriptional activation domain of the second fusion polypeptide is in sufficient
proximity to the DNA-binding domain of the first fusion polypeptide.

Having identified (i.e., using a two-hybrid system) individual 
polypeptide monobodies which have activity in binding to a target protein, the 
identified monobodies can also be used to validate the target. Thus, another aspect of 
the present invention relates to a method of target validation. Basically, this aspect of 
the present invention is used to demonstrate that inhibiting target protein function 
produces the desired effect. The desired effect can be therapeutic, overcoming a 
disease state, or prophylactic. 

In addition to nuclear receptors of the type described above, a number
of targets can be identified and validated, including other signal transducing proteins
such as G proteins, cell surface receptors (e.g., interleukin 2 receptors, growth
hormone receptors, BI receptors, integrins, G protein-coupled receptors, etc.), and
plant signaling proteins (e.g., CLV1/CLV2 receptor kinase complex); cell cycle
regulatory proteins such as protein kinases (e.g., human CDK2) and protein
phosphatases (e.g., human CDC25); infectious agent proteins such as virus proteins
(e.g., HIV TAT, HIV reverse transcriptase, Vpr, Vpu, Nef, etc.), bacterial proteins
(e.g., dihydrofolate reductase, thymidylate synthase, etc.), and fungal proteins (e.g.,
CPG-1); apoptosis-related proteins (e.g., Bcl-2, IGF-2, p53); and transmembrane
proteins (e.g., MDR-1, MRP, etc.).

Basically, the target-binding activity of a particular polypeptide
monobody can be determined by performing a two-hybrid system screening for
binding activity. Once polypeptide monobodies having the requisite binding activity
have been identified, target protein validation can be conducted.

According to one embodiment, the method of validating target protein
activity can be carried out by exposing a target protein to a polypeptide monobody
which binds to the target protein and then determining whether binding of the target
protein by the polypeptide monobody modifies target protein activity.

The exposing is preferably carried out in vivo using a host cell (e.g., a
bacterial cell, mammalian cell, or yeast cell). The exposure can be carried out under a
number of conditions depending upon the type of target protein which is being
evaluated with a particular polypeptide monobody.

According to one approach, exposing can be carried out according to a
two-hybrid assay with competition. The exposing is performed by co-expressing in a
single cell including a reporter gene under control of a 5' regulatory region: (i) a first
fusion polypeptide including a transcriptional activation domain fused to a target
protein co-activator which binds the target protein, (ii) a second fusion polypeptide
including a target protein fused to a C-terminus of a DNA-binding domain which
binds to the 5' regulatory region of the reporter gene, and (iii) a polypeptide
monobody which binds the target protein. In this embodiment, absence of reporter
gene expression indicates that the polypeptide monobody effectively inhibits the
activity of the target protein and the target protein co-activator.

Several other approaches can be utilized depending upon the nature of
the target protein activity and whether a target protein has a known activity.

When activity of the target protein is unknown, mRNA or protein
expression levels before and after exposure to the polypeptide monobody can be
detected and then compared to identify proteins which are downstream of a metabolic
pathway in which the target protein is involved. Modified expression levels indicate
modified target protein activity.

When a target protein is known to be required for cell growth or
survival, determining whether target protein activity has been modified can be
achieved by measuring cell growth or survival after exposure to the polypeptide
monobody, wherein reduced cell growth or survival indicates inhibition of target
protein activity.

When a target protein is a pathogen protein involved in host-pathogen
interaction, the exposing is carried out in a host cell that includes the polypeptide
monobody. The host cell is preferably one which is normally susceptible to pathogen
infiltration and the host cell is exposed to the pathogen (e.g., virus, bacteria, fungus,
etc.) under conditions which would normally be sufficient to allow for pathogen
infiltration. To determine whether the polypeptide monobody can modify target
protein activity, the extent of pathogen-induced disease progression is measured in the
host cell.

Yet another aspect of the present invention relates to measuring the
binding affinity of a polypeptide monobody for a target protein. This aspect of the
present invention is carried out by exposing a target protein to an interaction partner
which binds the target protein and a polypeptide monobody which binds the target
protein and measuring the degree to which the polypeptide monobody competes with
the interaction partner.

According to one approach, this is a competitive assay which can be
carried out in vitro. Typically, the target protein is bound to a substrate and the
polypeptide monobody includes a label (e.g., alkaline phosphatase tag or a His(6) tag),
which allows the degree of monobody binding to be measured both in the absence of
the interaction partner and in the presence of the interaction partner. By measuring the
difference between the degree of binding under such conditions, it is possible to
estimate the binding affinity of the polypeptide monobody if the binding affinity of the
interaction partner is known.
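
One way to turn the two binding measurements into an affinity estimate is sketched below. The text does not specify a binding model, so this example assumes a simple one-site competitive equilibrium in which the free concentrations of the labeled monobody and of the interaction partner are approximately equal to their total concentrations. The function name `estimate_monobody_kd` and its parameters are hypothetical.

```python
def estimate_monobody_kd(bound_without, bound_with, monobody_conc,
                         partner_conc, partner_kd):
    """Estimate the monobody Kd under an assumed one-site competition model.

    Assumed model (M = monobody, I = interaction partner):
        signal without partner ~ M / (Kd + M)
        signal with partner    ~ M / (Kd * (1 + I / Kd_partner) + M)
    Taking the ratio r of the two signals and solving for Kd gives
        Kd = M * (1 - r) / (r * (1 + I / Kd_partner) - 1)
    All concentrations and dissociation constants share the same units.
    """
    if bound_without <= 0 or not 0 < bound_with <= bound_without:
        raise ValueError("expected 0 < bound_with <= bound_without")
    ratio = bound_with / bound_without
    competition = 1.0 + partner_conc / partner_kd
    denominator = ratio * competition - 1.0
    if denominator <= 0:
        # The observed drop exceeds what this model allows for any Kd.
        raise ValueError("signal drop is too large for this model")
    return monobody_conc * (1.0 - ratio) / denominator


# Illustrative numbers only (molar units), not data from the examples.
kd = estimate_monobody_kd(bound_without=0.5, bound_with=1 / 12,
                          monobody_conc=1e-7, partner_conc=1e-6,
                          partner_kd=1e-7)
print(kd)
```

The estimate is only as good as the model assumptions; ligand depletion or multiple binding sites would call for a fuller treatment.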

According to another approach, this assay can be carried out in
vivo according to a two-hybrid assay with competition. The exposing is performed by
co-expressing in a cell including a reporter gene under control of a 5' regulatory
region: (i) a first fusion polypeptide including a transcriptional activation domain
fused to a target protein co-activator which binds the target protein, (ii) a second
fusion polypeptide including the target protein fused to a C-terminus of a DNA-binding
domain which binds to the 5' regulatory region of the reporter gene, and (iii) a
polypeptide monobody which binds the target protein. Where no substantial reduction
in reporter gene expression is detected (relative to a control in which the polypeptide
monobody is not present), then the binding affinity of the polypeptide monobody is less
than that of the co-activator. In contrast, where a substantial reduction in reporter gene
expression is detected relative to the control, then the binding affinity of the polypeptide
monobody is similar to or greater than that of the co-activator, indicating that the
polypeptide monobody effectively competes with the interaction partner for binding to
the target protein.
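
The qualitative comparison described above can be written as a small decision rule. This is a sketch only: the text does not define what counts as a "substantial" reduction, so the 50% cutoff used here is an arbitrary placeholder, and the function name `interpret_competition` is hypothetical.

```python
def interpret_competition(reporter_with_monobody, reporter_control,
                          substantial_fraction=0.5):
    """Compare reporter expression with and without the monobody.

    `substantial_fraction` is an assumed cutoff: reporter signal at or below
    this fraction of the control is treated as a substantial reduction.
    """
    if reporter_control <= 0:
        raise ValueError("control reporter signal must be positive")
    relative = reporter_with_monobody / reporter_control
    if relative <= substantial_fraction:
        return "monobody affinity similar to or greater than co-activator"
    return "monobody affinity less than co-activator"


print(interpret_competition(reporter_with_monobody=20.0, reporter_control=100.0))
```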

Having validated a polypeptide monobody's activity in binding a target
protein and modifying its activity, the tested polypeptide monobodies can therefore be
used to modulate target protein activity. Thus, a further aspect of the present
invention relates to a method of modulating target protein activity which includes:
exposing a target protein to a polypeptide monobody which binds the target protein
under conditions effective to modify target protein activity. Modification of target
protein activity is particularly suited for providing therapeutic or prophylactic benefit
and, therefore, exposure of the polypeptide monobody to the target protein is
preferably carried out in vivo (e.g., in a yeast cell, bacterial cell, or mammalian cell).

Having identified and validated that certain polypeptide monobodies 
bind to a target protein (whether it assumes a particular conformation or not), the 
polypeptide monobodies can also be used for therapeutic administration to modify the 
activity of the target protein in vivo. 

For purposes of therapeutic usage, it is preferred that the polypeptide 
monobodies be prepared in substantially pure form. This can be performed according 
to standard procedures. Typically, this involves recombinant expression of the desired 
polypeptide monobody by a host cell, propagation of the host cells, lysing the host 
cells, and recovery of supernatant by centrifugation to remove host cell debris. The 
supernatant can be subjected to sequential ammonium sulfate precipitation. The 
fraction containing the polypeptide monobody of the present invention is subjected to 
gel filtration in an appropriately sized dextran or polyacrylamide column to separate 
the polypeptide monobodies. If necessary, the protein fraction may be further purified 
by HPLC. The isolation and purification of polypeptide monobodies, in particular, 
has previously been reported by Koide et al. (1998). 

According to one embodiment, polypeptide monobodies which bind to
the estrogen receptor and function as antagonists can be used in treating or preventing
breast cancer. Exemplary antagonist monobodies are those which inhibit SRC-1
(infra). Current breast cancer treatments include the use of antiestrogens such as
tamoxifen and raloxifene as chemotherapeutics. Thus, polypeptide monobodies with
antagonist behavior would also be expected to be useful as a cancer therapeutic.

A number of known delivery techniques can be utilized for the 
delivery, into cells, of either the polypeptide monobodies themselves or nucleic acid 
molecules which encode them. 

Regardless of the particular method of the present invention which is 
practiced, when it is desirable to contact a cell (i.e., to be treated) with a polypeptide 
monobody or its encoding nucleic acid, it is preferred that the contacting be carried 
out by delivery of the polypeptide monobody or its encoding nucleic acid into the cell. 

One approach for delivering a polypeptide monobody or its encoding
RNA into cells involves the use of liposomes. Basically, this involves providing a
liposome which includes the polypeptide monobody or its encoding RNA to be
delivered, and then contacting the target cell with the liposome under conditions
effective for delivery of the polypeptide monobody or RNA into the cell.

Liposomes are vesicles comprised of one or more concentrically 
ordered lipid bilayers which encapsulate an aqueous phase. They are normally not 
leaky, but can become leaky if a hole or pore occurs in the membrane, if the 
membrane is dissolved or degrades, or if the membrane temperature is increased to the 
phase transition temperature. Current methods of drug delivery via liposomes require 
that the liposome carrier ultimately become permeable and release the encapsulated 
drug at the target site. This can be accomplished, for example, in a passive manner 
wherein the liposome bilayer degrades over time through the action of various agents 
in the body. Every liposome composition will have a characteristic half-life in the 
circulation or at other sites in the body and, thus, by controlling the half-life of the 
liposome composition, the rate at which the bilayer degrades can be somewhat 
regulated. 

In contrast to passive drug release, active drug release involves using 
an agent to induce a permeability change in the liposome vesicle. Liposome 
membranes can be constructed so that they become destabilized when the 
environment becomes acidic near the liposome membrane (Wang & Huang, 1987). 
When liposomes are endocytosed by a target cell, for example, they can be routed to 
acidic endosomes which will destabilize the liposome and result in drug release.

Alternatively, the liposome membrane can be chemically modified
such that an enzyme is placed as a coating on the membrane which slowly destabilizes
the liposome. Since control of drug release depends on the concentration of enzyme
initially placed in the membrane, there is no real effective way to modulate or alter
drug release to achieve "on demand" drug delivery. The same problem exists for
pH-sensitive liposomes in that as soon as the liposome vesicle comes into contact with a
target cell, it will be engulfed and a drop in pH will lead to drug release.

This liposome delivery system can also be made to accumulate at a
target organ, tissue, or cell via active targeting (e.g., by incorporating an antibody or
hormone on the surface of the liposomal vehicle). This can be achieved according to
known methods.

Different types of liposomes can be prepared according to Bangham et
al. (1965); U.S. Patent No. 5,653,996 to Hsu et al.; U.S. Patent No. 5,643,599 to Lee
et al.; U.S. Patent No. 5,885,613 to Holland et al.; U.S. Patent No. 5,631,237 to Dzau
et al.; and U.S. Patent No. 5,059,421 to Loughrey et al., as well as any other approach
demonstrated in the art.

An alternative approach for delivery of polypeptide monobodies
involves the conjugation of the desired polypeptide monobody to a polymer that is
stabilized to avoid enzymatic degradation of the conjugated monobody. Conjugated
proteins or polypeptides of this type are described in U.S. Patent No. 5,681,811 to
Ekwuribe.

Yet another approach for delivery of polypeptide monobodies involves
preparation of chimeric proteins according to U.S. Patent No. 5,817,789 to Heartlein
et al. The chimeric protein can include a ligand domain and, e.g., a polypeptide
monobody which has activity to bind a cellular target (e.g., a nuclear receptor or other
cellular protein). The ligand domain is specific for receptors located on a target cell.
Thus, when the chimeric protein is delivered intravenously or otherwise introduced
into blood or lymph, the chimeric protein will adsorb to the targeted cell, and the
targeted cell will internalize the chimeric protein. An exemplary approach is the HIV
Tat protein.

When it is desirable to achieve heterologous expression of a desirable
polypeptide monobody in a target cell, DNA molecules encoding the polypeptide
monobody can be delivered into the cell. Basically, this includes providing a nucleic
acid molecule encoding the polypeptide monobody and then introducing the nucleic
acid molecule into the cell under conditions effective to express the polypeptide
monobody in the cell. Preferably, this is achieved by inserting the nucleic acid
molecule into an expression vector before it is introduced into the cell.

When transforming mammalian cells for heterologous expression of a
polypeptide monobody, an adenovirus vector can be employed. Adenovirus gene
delivery vehicles can be readily prepared and utilized given the disclosure provided in
Berkner (1988) and Rosenfeld et al. (1991). Adeno-associated viral gene delivery
vehicles can be constructed and used to deliver a gene to cells. The use of
adeno-associated viral gene delivery vehicles in vivo is described in Flotte et al. (1993) and
Kaplitt et al. (1994). Additional types of adenovirus vectors are described in U.S.
Patent No. 6,057,155 to Wickham et al.; U.S. Patent No. 6,033,908 to Bout et al.; U.S.
Patent No. 6,001,557 to Wilson et al.; U.S. Patent No. 5,994,132 to Chamberlain et
al.; U.S. Patent No. 5,981,225 to Kochanek et al.; U.S. Patent No. 5,885,808 to
Spooner et al.; and U.S. Patent No. 5,871,727 to Curiel.

Retroviral vectors which have been modified to form infective
transformation systems can also be used to deliver nucleic acid encoding a desired
polypeptide monobody into a target cell. One such type of retroviral vector is
disclosed in U.S. Patent No. 5,849,586 to Kriegler et al.

Regardless of the type of infective transformation system employed, it
should be targeted for delivery of the nucleic acid to a specific cell type. For example,
for delivery of the nucleic acid into tumor cells, a high titer of the infective
transformation system can be injected directly within the tumor site so as to enhance
the likelihood of tumor cell infection. The infected cells will then express the desired
polypeptide monobody, allowing the polypeptide monobody to modify the activity of
its target protein.

According to one embodiment, the polypeptide monobody (or fusion
protein which includes the polypeptide monobody) can also include a localization
signal for retention of the monobody in the endoplasmic reticulum. An exemplary
localization signal is a KDEL amino acid sequence (SEQ ID No: 21) secured via
peptide bond to the C-terminal end of the polypeptide monobody.

Whether the polypeptide monobodies or nucleic acids are administered
alone or in combination with pharmaceutically or physiologically acceptable carriers,
excipients, or stabilizers, or in solid or liquid form such as tablets, capsules, powders,
solutions, suspensions, or emulsions, they can be administered orally, parenterally,
subcutaneously, intravenously, intramuscularly, intraperitoneally, by intranasal
instillation, by intracavitary or intravesical instillation, intraocularly, intraarterially,
intralesionally, or by application to mucous membranes, such as those of the nose,
throat, and bronchial tubes. For most therapeutic purposes, the polypeptide
monobodies or nucleic acids can be administered intravenously.

For injectable dosages, solutions or suspensions of these materials can
be prepared in a physiologically acceptable diluent with a pharmaceutical carrier.
Such carriers include sterile liquids, such as water and oils, with or without the
addition of a surfactant and other pharmaceutically and physiologically acceptable
carriers, including adjuvants, excipients, or stabilizers. Illustrative oils are those of
petroleum, animal, vegetable, or synthetic origin, for example, peanut oil, soybean oil,
or mineral oil. In general, water, saline, aqueous dextrose and related sugar solutions,
and glycols, such as propylene glycol or polyethylene glycol, are preferred liquid
carriers, particularly for injectable solutions.

For use as aerosols, the polypeptide monobodies or nucleic acids in
solution or suspension may be packaged in a pressurized aerosol container together
with suitable propellants, for example, hydrocarbon propellants like propane, butane,
or isobutane with conventional adjuvants. The materials of the present invention also
may be administered in a non-pressurized form such as in a nebulizer or atomizer.

Dosages to be administered can be determined according to known
procedures, including those which balance both drug efficacy and degree of side
effects.



EXAMPLES

The following examples are provided to illustrate embodiments of the
present invention but are by no means intended to limit its scope.

Materials and Methods

17β-estradiol (E2) and 4-hydroxytamoxifen (OHT) were purchased
from Sigma; diethylstilbestrol, estriol, and progesterone were obtained from Steraloids;
ICI 182,780 was purchased from Tocris; and raloxifene is a product of Eli Lilly. An
anti-ERα (F domain) antibody, HC-20, was purchased from Santa Cruz Biotech, and an
anti-LexA antibody was kindly provided by Dr. E. Golemis (Fox Chase Cancer
Center). Secondary antibodies were purchased from Pierce. An estrogen receptor α
(ERα) cDNA clone was kindly provided by the late Dr. A. Notides (University of
Rochester Medical Center). The cDNA clone for steroid receptor coactivator-1 (SRC-1)
was a generous gift from Dr. B. W. O'Malley (Baylor College of Medicine) (Onate
et al., 1995).

Yeast strains EGY48, MATα his3 trp1 ura3 leu2::6LexAop-LEU2,
and RFY206, MATa his3Δ200 leu2-3 lys2Δ201 trp1Δ::hisG ura3-52, have been
described (Gyuris et al., 1993; Finley & Brent, 1994) and were purchased from
Origene. Yeast was grown in YPD media or YC dropout media following instructions
from Origene and Invitrogen.



Example 1 - Construction of Yeast Two-Hybrid Vectors and Monobody
Library

The methods of Brent and others were followed in the construction of
vectors (Colas & Brent, 1998; Mendelsohn & Brent, 1994; Golemis & Serebriiskii,
1997). The synthetic gene for FNfn10 (Koide et al., 1998) was subcloned in the
plasmid pYESTrp2 (Invitrogen, CA) so that FNfn10 was fused C-terminal to the B42
activation domain (pYT45). A map of pYT45 is shown in Figure 9. This plasmid
includes a T7 promoter sequence upstream of regions coding for (from 5' to 3') a V5
epitope, a nuclear localization signal, a B42 activation domain, and a combinatorial
polypeptide monobody derived from FNfn10. The nucleotide (SEQ ID No: 16) and
amino acid (SEQ ID No: 17) sequences for the B42-FNfn10 fusion are shown in
Figure 10.

The following plasmids encoding LexA-fusion proteins were
constructed by subcloning an appropriate PCR fragment in the plasmid pEG202
(Origene): pEGERα297-595, ERα-EF (residues 297-595, the E and F domains of
Estrogen Receptor α) (Figure 11); pEGERα297-554, ERα-E (residues 297-554, the E
domain of Estrogen Receptor α); pEGSRC1, residues 570-780 of SRC-1 (Onate et al.,
1995). Figures 12A-B illustrate the nucleotide (SEQ ID No: 18) and amino acid (SEQ
ID No: 19) sequences of the LexA-ERα fusion protein in plasmid pEGERα297-595. The F
domain is about 45 residues long, and it is believed to be highly flexible. Potential
roles of this domain in the ligand-dependent transcription activation have been
reported (Nichols et al., 1997; Montano et al., 1995). None of the published crystal
structures of the ER ligand-binding domain includes the F domain. The F domain was
included in one of the constructs so that the bait protein is closer to the full-length ER,
rather than just the ligand-binding domain.

A number of monobody libraries were constructed by diversifying
residues in several loop regions. Libraries pFNB42B5F7 (Figure 5) and pYT45B3F7
(Figure 7) were prepared by diversifying residues 26-30 in the BC loop and
randomizing residues 78-84 in the FG loop (residue numbering according to Koide et
al., 1998). Library pYT45AB7N was prepared by inserting seven diversified residues
between Pro-15 and Thr-16 in the AB loop (residue numbering according to Koide et
al., 1998). Library pYT47F16 was prepared by randomizing residues 78-85 and
inserting an additional eight randomized residues in the FG loop (residue numbering
according to Koide et al., 1998). In each instance, the above-noted residues were
randomized using the NNK codon (N denotes a mixture of A, T, G, and C; K denotes a
mixture of G and T) or the NNS codon (S denotes a mixture of G and C) by Kunkel
mutagenesis (Kunkel et al., 1987). The yeast strain EGY48 was transformed with this
plasmid to produce a library containing approximately 2 x 10^6 independent clones. To
facilitate fusion protein construction, NcoI and BamHI sites were introduced at the 5'
and 3' ends of monobody genes, respectively, using PCR.
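
The consequences of the NNK and NNS randomization schemes can be checked by enumerating the codons each scheme allows. The sketch below uses the standard genetic code; the helper name `degenerate_counts` is hypothetical, and the choice of 12 randomized positions (BC loop residues 26-30 plus FG loop residues 78-84) is used purely to illustrate the size of the theoretical sequence space relative to a library of roughly 2 x 10^6 clones.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position
# (third position varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degenerate_counts(third_bases):
    """Count codons per amino acid for N-N-(third_bases) degenerate codons."""
    return Counter(
        CODON_TABLE[b1 + b2 + b3]
        for b1, b2, b3 in product(BASES, BASES, third_bases)
    )


nnk = degenerate_counts("GT")  # K = G or T
nns = degenerate_counts("GC")  # S = G or C

for name, counts in (("NNK", nnk), ("NNS", nns)):
    sense = sum(n for aa, n in counts.items() if aa != "*")
    print(name, "Leu:", counts["L"], "Met:", counts["M"],
          "stop:", counts["*"], "sense codons:", sense)

# DNA-level sequence space for 12 fully randomized NNK positions.
RANDOMIZED_POSITIONS = 5 + 7
print(32 ** RANDOMIZED_POSITIONS)
```

Both schemes encode all 20 amino acids with a single stop codon (TAG), and both give three Leu codons for every Met codon. The theoretical sequence space for 12 positions is far larger than the number of independent clones obtained, so any such library samples only a small fraction of it.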

A yeast expression vector for a glutathione-S-transferase (GST)-monobody
fusion protein was constructed as follows. The XbaI-KpnI fragment of the
modified pYEX4T-1 vector that encodes the Pcup promoter and GST gene, kindly
provided by Dr. E. Phizicky (Martzen et al., 1999), was cloned between the XbaI and
KpnI sites of YEplac181 (Gietz & Sugino, 1988) to make pGSTleu. Then the gene
for a monobody (i.e., from the constructed library) was cloned between the NcoI and
BamHI sites of pGSTleu.

Example 2 - Screening of Monobody Library for Estrogen Receptor-α EF
Domain Specificity in the Presence of a Ligand

The yeast strain RFY206 harboring pEGERα297-595 and a LacZ
reporter plasmid, pSH18-34 (Origene), was mated with EGY48 containing the
monobody library (Finley & Brent, 1994). Diploid cells that contain an ERα-binding
monobody were selected using the Leu+ phenotype on minimal dropout media (Gal
Raf -leu -his -ura -trp). (Although ERα itself has a weak transcriptional activation
function in yeast (Chen et al., 1997), these constructs did not activate the LEU2
reporter gene to an extent that confers the Leu+ phenotype in the yeast EGY48.)

A series of library screenings was performed in the presence of different
ERα ligands (E2, estriol, and OHT). The ligand concentration used was 1 µM.
Colonies grown after three days of incubation were further tested for
galactose-dependence of the Leu+ phenotype and β-galactosidase activity. The plasmids coding
for a monobody were recovered from yeast clones following instructions supplied by
Origene, and the amino acid sequences of monobodies were deduced by DNA
sequencing.

Quantitative assays were performed as follows. The yeast strain
RFY206 was (1) first transformed with pEGERα297-595 (or pEGERα297-554) and
pSH18-34 and (2) subsequently with a derivative of the pYT45 plasmid encoding a
particular monobody. Yeast cells were grown overnight at 30°C in YC Glc -his -ura
-trp media. The culture was then spun down, the media were discarded, and the cells
were resuspended in YC Gal Raf -his -ura -trp media containing a ligand at a final
cell density of 0.2 OD660nm in a total volume of 175 µl in the wells of a deep 96-well
plate. Ligands used were E2, ICI 182,780, OHT, raloxifene, progesterone, estriol,
diethylstilbestrol, and genistein. The ligand concentration was 1 µM except for
genistein (10 µM). After incubating for six hours at 30°C with shaking, 175 µl of
β-galactosidase assay buffer (60 mM Na2HPO4, 40 mM NaH2PO4, 10 mM KCl, 1 mM
MgSO4, 0.27% β-mercaptoethanol, 0.004% SDS, 4 mg/ml 2-nitrophenyl-β-D-galactoside,
50% Y-PER (Pierce)) was added to the culture, the mixture was incubated at 30°C, and
the reaction was stopped by adding 150 µl of 1 M Na2CO3. After centrifugation, OD420
was measured and the β-galactosidase activity was calculated.
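
The text does not state how the β-galactosidase activity was calculated from OD420. A common convention is a Miller-style normalization to reaction time, culture volume, and cell density; the sketch below assumes that convention purely for illustration. The function name `beta_gal_units`, the reaction time, and the numerical inputs are all hypothetical and are not values taken from the examples.

```python
def beta_gal_units(od420, reaction_min, culture_ml, culture_od660):
    """Miller-style activity: 1000 * OD420 / (minutes * ml culture * OD660).

    Assumed normalization only; the procedure described above may have
    used a different calculation.
    """
    if min(reaction_min, culture_ml, culture_od660) <= 0:
        raise ValueError("time, volume, and cell density must be positive")
    return 1000.0 * od420 / (reaction_min * culture_ml * culture_od660)


# Illustrative inputs: 175 µl of culture, hypothetical 30 min reaction.
units = beta_gal_units(od420=0.35, reaction_min=30.0,
                       culture_ml=0.175, culture_od660=0.2)
print(round(units, 1))
```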

Western blotting was used to examine the amounts of the LexA fusion
and monobody proteins in yeast cells used for β-galactosidase assays. Yeast cells were
grown in the same manner as for the β-galactosidase assays described above. Yeast
cells were spun down to discard media and weighed. The cells were suspended in 5 µl
Y-PER (Pierce) per mg of cells, then 1 mM PMSF and 540 µg/ml leupeptin were added,
and the samples were incubated at room temperature for 20 min with gentle agitation.
The suspension was spun down, supernatant was recovered, and the pellet was
resuspended in 5 mM Tris-Cl (pH 8.0). The supernatant and suspension were examined
by Western blotting.

Multiple positive clones were obtained from each screening and their
amino acid sequences were determined, as shown in Tables 1-4 below.

Table 1: Estrogen Receptor-Binding Clones Obtained from the pFNB42B5F7 Library

                                                                     Binding
                   Amino Acid Sequence                               Specificity*
Initial  Clone
Screen   Name      BC loop                  FG loop                  E2    ICI
-------  --------  -----------------------  -----------------------  ----  ----
E2       B1        AVTVR (wild type)        GILEMLQ (SEQ ID No: 25)  +     ND
E2       C2        WYQGR (SEQ ID No: 22)    RLRAQLV (SEQ ID No: 26)  +     ND
E2       D1        AVTVR (wild type)        PVRVLLR (SEQ ID No: 27)  +     ND
E2       E1        PRTKQ (SEQ ID No: 23)    RLRDLLQ (SEQ ID No: 28)  +     ND
ICI      A4 (=E1)  PRTKQ (SEQ ID No: 23)    RLRDLLQ (SEQ ID No: 28)  +     ND
ICI      A6        AVTVR (wild type)        GLVSLLR (SEQ ID No: 29)  +     ND
ICI      B3        AVTVR (wild type)        RKWWTG (SEQ ID No: 30)         WEAK
ICI      C3        VRRPP (SEQ ID No: 24)    TAAIMVK (SEQ ID No: 31)        WEAK

*Binding specificity of the obtained clones was determined using a survival assay.
Note: wild type refers to residues 26-30 of SEQ ID No: 2.



Monobodies that were selected in the presence of an agonist (E2 and E3) contain
motifs similar to LXXLL (SEQ ID No: 20, where X is any amino acid), which is the
consensus of the NR boxes of coactivators (Heery et al., 1997). Interestingly, a
significant number of LXXML (SEQ ID No: 32, where X is any amino acid)
sequences were present among these clones. Because of the degeneracy of the
codons, Leu is expected to appear three times as often as Met at a given position that
was diversified in the library, suggesting that Met in the LXXML (SEQ ID No: 32)
sequence is preferred over Leu. In addition, many of the clones contain an amino acid
with a carboxyl or amino side chain at the third position of the LXXLL (SEQ ID No:
20)-like motifs. These motifs bear a striking resemblance to the LLEML (SEQ ID No:
33) sequence within helix 12 of ERα and β. In the ERα/OHT crystal structure, the
LLEML (SEQ ID No: 33) segment of helix 12 occupies the coactivator binding site
(Figure 13C) (Shiau et al., 1998). The sequence similarity of the isolated monobodies
to the coactivator motif strongly suggests that these monobodies directly bind to ERα.
In contrast, monobodies identified from screening in the presence of OHT contain an
amino acid sequence that is distinctly different from the LXXLL (SEQ ID No: 20)
motif. These sequences do not show obvious homology to those of linear peptides
selected for binding to the ERα/OHT complex by Norris et al. (1999).
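
The three-to-one Leu:Met expectation can be checked by counting codons. The sketch below assumes that the diversified positions were encoded with NNK codons (N = any base, K = G or T). That scheme is an assumption for illustration, since the randomization scheme is not restated in this passage; the helper name `residue_counts` is likewise hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered T, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def residue_counts(third_position_bases):
    """Count codons per amino acid when the third base is restricted."""
    counts = Counter()
    for first, second, third in product(BASES, BASES, third_position_bases):
        counts[CODON_TABLE[first + second + third]] += 1
    return counts


nnk = residue_counts("GT")  # assumed NNK scheme: third base is G or T
leu_to_met = nnk["L"] / nnk["M"]
print(nnk["L"], nnk["M"], leu_to_met)
```

Under this assumption Leu is encoded by three of the 32 codons and Met by one, so an excess of Met over that ratio at a motif position points to selection rather than library bias.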

Table 2: Estrogen Receptor-Binding Clones Obtained from the pYT45AB7N Library

Clone Name    Amino Acid Sequence in the AB Loop
              P15 T16 (wild type)
              PXXXXXXXT (library)
A1            WTWVLRE (SEQ ID No: 34)
B1            WVLITRS (SEQ ID No: 35)

Note: Library denotes residues 17-25 in SEQ ID No: 9.



Table 3: Estrogen Receptor-Binding Clones Obtained from the pYT45B3F7 Library

Initial    Clone                Amino Acid Sequence        Binding Specificity*
Screen     Name                 in FG Loop                 E2    DES   Gen.  ICI    OHT   No Ligand
E2         23, 31, E3 1,3,4,5   LRLMLAG (SEQ ID No: 36)    +     +     +     +
E2         F2-2#3               ALVEMLR (SEQ ID No: 37)    +     +     +
E2         F2-2#4               RLLWNSL (SEQ ID No: 38)    +     +     +
E2         F2-2#5, Geni H4      RVLMTLL (SEQ ID No: 39)    +     +     +     ?
E2         F2-2#7,#12           GLRRLLR (SEQ ID No: 40)    +     +     +     ?
E2         F2-2#8               GLRQMLG (SEQ ID No: 41)    +     +     +     +
E2         F2-2#9               RVLHSLL (SEQ ID No: 42)    +     ND    ND    +
E2         F2-2#10              RVRDLLM (SEQ ID No: 43)    +     ND    ND    weak+
E2         F2-2#11              RVMDMLL (SEQ ID No: 44)    +     ND    ND    +
E3         2                    GIAELLR (SEQ ID No: 45)    +     +     +     +
E3         6,7                  RILLNMLT (SEQ ID No: 46)   +     +     +     +      +     +
OHT        31                   GGWLWCVT (SEQ ID No: 47)                     +      +
OHT        32                   TWWRRV (SEQ ID No: 48)                       +      +
OHT        33                   TWVRPNQ (SEQ ID No: 49)                      +      +
ICI        16-3A                RRVPIWC (SEQ ID No: 50)    +     +     +     +
Genistein  D1                   RRVYDFL (SEQ ID No: 51)    +           +
Genistein  E1                   LRQMLAD (SEQ ID No: 52)    +           +
Genistein  E4,D6                GLRMLLR (SEQ ID No: 53)    +           +

All the clones obtained from these screening trials contained the wild-type sequence in the BC loop.
* Binding specificity of the obtained clones was determined using a survival assay.

Abbreviations for ligands are: E2, 17β-estradiol; E3, estriol; DES, diethylstilbestrol; Gen., genistein; ICI,
ICI 182,780; OHT, 4-hydroxytamoxifen.
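
The FG-loop sequences listed in Table 3 can be checked for LXXLL-like motifs with a short pattern search. This is a sketch only: the clone labels used as dictionary keys are informal shorthand, only a subset of the table is included, and the pattern (L, two arbitrary residues, L or M, then L) is one reasonable reading of "LXXLL or LXXML", not a definition taken from the text.

```python
import re

# A subset of FG-loop sequences copied from Table 3 (label -> sequence).
FG_LOOPS = {
    "E2 23": "LRLMLAG",
    "E2 F2-2#3": "ALVEMLR",
    "E2 F2-2#7": "GLRRLLR",
    "E3 2": "GIAELLR",
    "OHT 31": "GGWLWCVT",
    "OHT 33": "TWVRPNQ",
}

# Lookahead so that overlapping matches are all reported;
# [LM] at the fourth position covers both LXXLL and LXXML.
MOTIF = re.compile(r"(?=(L..[LM]L))")


def find_motifs(sequence):
    """Return every LXXLL or LXXML segment found in the sequence."""
    return [match.group(1) for match in MOTIF.finditer(sequence)]


for label, sequence in FG_LOOPS.items():
    print(label, sequence, find_motifs(sequence))
```

A strict pattern of this kind misses near-matches such as GIAELLR, where Ile stands in for the first Leu, so a looser pattern would be needed to capture motifs that are merely similar to LXXLL.
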
Initial
Screen 


Clone 
Name 


Amino Acids Sequence 
in FG Loop 




Binding Specificity* 

No 

DES Gen. ICI OHT Ligand 


E2 




S RRL vEHL AGVE V QALi 




+ 




+ 


+ - - 


E2 


If 


LVARMLiDW fciDLjEiEi Ab F 


(SEQ ID No: 55)


+ 




+ 


+ - - 


E2


AQ 


QGKGRRRGLVLYLLGS 


(SEQ ID No: 56)


+ 






+ - - 


E2 


B 


RLRELLAEAAQASDGE 


(SEQ ID No: 57)


+ 


+ 




+ - - 


E2 


z 


LLLRVGCGCRJj VCjb V J-i 


(SEQ ID No: 58)


+ 




+


+ - - 


E2 


6 


RLSIVPCPAWARLi I VJj 








+ 


+ + - 


E2 


ll 


LLVGLLLLRGARSGST


(SEQ ID No: 60)


+ 






+ - - 


E3 


12 


LIYGLLSQPEERDEWR


(SEQ ID No: 61)


+






-4- 4- 


E3 


13 




(SEQ ID No: 62) 




+ 


+ 


+ - - 


E3 


14 


WFDHERHGMLWQLLLR 


(SEQ ID No: 63) 




+ 


+ 


+ 


E3 


15 


RLWCLLQRKGRNPIDM 


(SEQ ID No: 64) 




+ 


+ 


+ 




13 14 ?0 


RVFFGIGCRGGTGGGN


(SEQ ID No: 65) 








+ 


OHT 


21 


RVRFRCGGRDAASGDQ 


(SEQ ID No: 66) 








+ 


OHT 


1,5 


LVRFRVVNSSLCMWAR


(SEQ ID No: 67) 








+ 


OHT 


2 


LVRLGVAGHMDAGAGR 


(SEQ ID No: 68) 








+ 


OHT 


4,22 


PADGSEVLRLVKIHYV 


(SEQ ID No: 69) 








+ 


OHT 


24 


RLEYGDVIGAVWWGRV 


(SEQ ID No: 70) 




ND 


ND 


+ 


OHT 


3 


QGAAVRTLVAGGGVAS 


(SEQ ID No: 71) 


+ 


+ 


+ 


+ + - 


OHT 


6 


LEVRVAAGCIAGGGRR


(SEQ ID No: 72) 


+ 


+ 


+ 


+ + - 


ICI 


16-4B 


RLWRMLSGEPARVDHE 


(SEQ ID No: 73) 


+ 


+ 


+ 


+ + + 



* Binding specificity of the obtained clones was determined using a survival assay.

Abbreviations for ligands are: E2, 17β-estradiol; E3, estriol; DES, diethylstilbestrol; Gen., genistein;
ICI, ICI 182,780; OHT, 4-hydroxytamoxifen.

Example 3 - Discrimination of Estrogen Receptor-α Conformations in Living
Cells Using Conformation-Specific Monobodies

The binding specificity of the monobodies toward different ERα-EF/ligand
complexes was examined using quantitative β-galactosidase assays. It has
been shown that the β-galactosidase activity correlates well with the interaction
affinity between the bait and prey of the yeast two-hybrid system (Estojak et al.,
1995), allowing an in vivo discrimination of interaction affinity. To minimize the
effect of different ligands on the expression level and degradation of the LexA-ER
fusion protein, β-galactosidase activity was determined after a short incubation period
(6 hours) following the addition of a ligand and the initiation of monobody
production. It was confirmed that yeast samples prepared in the presence and absence
of ligands contained similar levels of ERα-EF protein (Figure 14H). In addition, it
was found that these ligands have little effect on the expression level of monobodies.

The in vivo interaction between these monobody clones and ERα-EF
was tested in the presence of different ERα ligands (Figures 14A-G). In general,
monobody clones selected for an ERα-EF/agonist (E2 and estriol) complex interacted
with ERα-EF in the presence of E2, but not in the presence of OHT or other
antagonists. The binding specificity of these clones is similar to that of the NR-box
fragment of the coactivator SRC-1, suggesting that these clones recognize a surface
of the ER-LBD that is used for coactivator binding. The clone E3#6 showed weak but
significant interaction with the ERα-EF/raloxifene complex (Figure 14D). In an
analogous manner, monobodies selected for the ERα-EF/OHT complex were specific
to the same complex (Figure 14E). In addition, the affinity of the selected
monobodies to an unrelated protein (the pBait control protein; Origene) was below the
detection limit of our assay.

The effects of different agonists on the interactions between ERα-EF
and monobodies were also tested (Figures 15A-D). Clone E2#11 showed different
reactivity to different agonist complexes of ERα-EF (Figure 15D), while clone E2#23
and the NR-box fragment of coactivator SRC-1 bound equally well to these agonist
complexes (Figures 15A-C). Taken together, these results demonstrate that one can
isolate monobodies that are specific to different conformations of ERα-EF, and that
one can use such monobodies to detect conformational differences of ERα-EF in the
nucleus induced by various ligands, even small changes induced by different agonists.

The profiles (Figures 18A-B) of in vivo interaction between ERα-EF and
monobodies from the pYT45AB7N library (Table 2) were distinct from those between
ERα-EF and monobodies from the other libraries (Figures 14A-H). The two
monobodies, A1 and B1, from the pYT45AB7N library were selected in the presence
of estradiol. Nevertheless, they do not contain the consensus LXXLL (SEQ ID
No: 20)-like sequence (Table 2). Moreover, A1 and B1 bind equally well to the
estradiol- and hydroxytamoxifen-complexes of ERα-EF (Figures 18A-B). These
results demonstrate that monobodies with distinct functions can be obtained by
screening libraries in which different loop regions are diversified.

Furthermore, the interaction specificity of these two monobodies to
ERα and ERβ is quite different (compare Figures 18A-B with 18C-D). These results
suggest that these monobodies can discriminate the surface properties of ERα from
those of ERβ. The ERβ cDNA clone was kindly provided by Dr. M. Muyan of the
University of Rochester Medical Center. A prey plasmid, pEGERβ248-530, was
constructed by cloning the DNA fragment corresponding to the EF domains of ERβ
(residues 248-530) into pEG202 in the same manner as for construction of
pEGERα297-554.

Example 4 - Roles of the F Domain on the Conformational Dynamics of the
Estrogen Receptor-α Ligand-Binding Domain

The effects of the F domain (residues 551-595) on interactions of
monobodies with the LBD (the E domain) of ERα were tested. The β-galactosidase
activity of cells containing a LexA-ERα E domain fusion protein and a monobody-
activation domain fusion protein was compared to the β-galactosidase activity of cells
containing LexA-ERα-EF and the same monobody-activation domain fusion protein
(Figures 16A-E). It was confirmed that the expression levels of ERα-E and -EF bait
proteins were similar, and that the cells containing the ERα-EF fusion protein do not
have breakdown products similar to the ERα-E fusion protein (Figure 16E). In the
presence of E2, the deletion of the F domain had little effect on the interactions of
E2#23, E3#6 and SRC-1 with the ERα fragments (Figures 16A-C), suggesting that the
F domain does not constitute the binding site for these proteins. In contrast, the
deletion of the F domain resulted in a significant increase (more than 100-fold in
β-galactosidase activity) in binding of E3#6 and SRC-1 to ERα in the absence of a
bound ligand (Figures 16A-B). A somewhat similar effect of the F domain was
observed for the binding of the clone OHT#33. OHT#33 interactions were similar
with ERα-E and ERα-EF in the presence of OHT, while the interaction of this
monobody with the ERα-E/raloxifene complex was significantly greater than that with
the ERα-EF/raloxifene complex (Figure 16D). In contrast to the data with
monobodies that bind to ERα/agonist complexes, the deletion did not increase the
interaction of OHT#33 and ERα in the absence of a ligand.

Example 5 - Use of Polypeptide Monobodies as Sensors 

As described above, the collection of yeast strains that respond
differently to different ER-ligand complexes can potentially be used as sensors for ER
ligands. As shown in Figures 17A-D, arrays of yeast can be grown on a solid
medium, with each colony expressing a particular monobody having an affinity for
ER-α in the presence of an agonist or antagonist. The array in Figure 17A shows
β-galactosidase activity in the absence of an agonist or antagonist, whereas the array in
Figure 17B shows no β-galactosidase activity in the absence of an agonist or
antagonist. Figures 17C-D demonstrate, respectively, detectable β-galactosidase
activity in the presence of E2 (agonist) and OHT (antagonist). Thus, it is possible to
identify new agonist or antagonist compounds which have an affinity for ER-α
based upon their interaction with yeast expressing both a LexA-ERα E or EF domain
fusion protein and a monobody-activation domain fusion protein. New agonists
having E2-like binding should produce results similar to those shown in Figure 17C,
whereas new antagonists having OHT-like binding should produce results similar to
those shown in Figure 17D.
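
The comparison of a test compound's array response with the E2-like and OHT-like reference arrays can be sketched as a nearest-pattern lookup. Everything in the example is hypothetical: the four-colony layout, the on/off reference patterns, and the use of Hamming distance are illustrative assumptions and are not read from Figures 17A-D.

```python
# Hypothetical on/off reporter patterns across four monobody colonies.
# These values are placeholders and are not taken from Figure 17.
REFERENCE_PATTERNS = {
    "E2-like (agonist)": (1, 1, 0, 0),
    "OHT-like (antagonist)": (0, 0, 1, 1),
    "no ligand": (0, 0, 0, 0),
}


def hamming(a, b):
    """Number of colonies whose on/off state differs between two arrays."""
    if len(a) != len(b):
        raise ValueError("arrays must cover the same colonies")
    return sum(x != y for x, y in zip(a, b))


def classify(observed):
    """Return the reference pattern closest to the observed array."""
    return min(
        REFERENCE_PATTERNS,
        key=lambda name: hamming(observed, REFERENCE_PATTERNS[name]),
    )


print(classify((1, 1, 0, 0)))
print(classify((0, 1, 1, 1)))
```

With quantitative β-galactosidase readings instead of on/off calls, a correlation or Euclidean distance between activity profiles could replace the Hamming distance.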

Example 6 - Use of Polypeptide Monobodies to Modulate Estrogen Receptor 
Interactions 

The interaction between ER and the natural coactivator, SRC-1, was
examined in the presence of a polypeptide monobody. The yeast two-hybrid system
that monitored the interaction between ERα-EF and SRC-1 was used. The monobody
E2#23 was co-expressed under the control of a separate promoter. β-Galactosidase
activity in the presence of E2 decreased by approximately 30% when the monobody
was expressed, while co-expression of the wild-type FNfn10 did not alter the level of
the marker enzyme activity. This inhibitory effect was reduced when the expression
level of the SRC-1-activation domain fusion was increased. These results suggest that
the monobody binds to the coactivator-binding site of ERα in a competitive manner
against SRC-1. It is likely that increased expression levels of the monobodies would
further augment the observed inhibition. Thus, these results suggest that
monobody-based inhibitors of nuclear receptors can be developed.
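
The qualitative behavior described above, with inhibition falling as SRC-1 expression rises and growing as monobody expression rises, is what a simple single-site competition model predicts. The sketch below assumes that SRC-1 and the monobody compete for one site on ERα at equilibrium. The concentrations and dissociation constants are hypothetical, dimensionless values chosen only for illustration and are not measured quantities.

```python
def src1_occupancy(src1, monobody, k_src1=1.0, k_mono=1.0):
    """Fraction of ER sites occupied by SRC-1 under single-site competition."""
    return (src1 / k_src1) / (1.0 + src1 / k_src1 + monobody / k_mono)


def fractional_inhibition(src1, monobody, k_src1=1.0, k_mono=1.0):
    """Relative loss of SRC-1 occupancy caused by the competing monobody."""
    without = src1_occupancy(src1, 0.0, k_src1, k_mono)
    with_mono = src1_occupancy(src1, monobody, k_src1, k_mono)
    return 1.0 - with_mono / without


# Hypothetical levels: raising SRC-1 tenfold at a fixed monobody level.
low_src1 = fractional_inhibition(src1=1.0, monobody=1.0)
high_src1 = fractional_inhibition(src1=10.0, monobody=1.0)
more_monobody = fractional_inhibition(src1=1.0, monobody=5.0)
print(low_src1, high_src1, more_monobody)
```

In this model the inhibition equals (M/K_M) / (1 + S/K_S + M/K_M), so it decreases with the SRC-1 level S and increases with the monobody level M, in line with the two observations in the text.
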
Thus, a collection of yeast two-hybrid cells containing a nuclear
receptor ligand binding domain and an appropriate monobody can be used for
screening of drug-like molecules (Chen et al., 1997; Nishikawa et al., 1999). By
expressing the nuclear receptor in yeast, the system is not limited by the presence of a
natural protein that interacts with the nuclear receptor in the presence of a particular
ligand. Thus, it should be possible to develop screening systems for chemicals that
induce a nuclear receptor into a conformation similar to that induced by a known
nuclear receptor ligand.

Discussion of Examples 1-6 

The above Examples demonstrate that monobodies that are specific to a
particular conformation of ERα can be obtained, and that one can probe
conformational changes of ERα in living cells using such monobodies. The ability
to detect conformational changes of proteins in the native environment should bridge
the gap that currently exists between high-resolution structural information obtained
from in vitro techniques and functional information from cell biology studies. The use
of engineered probes for conformational change, such as the monobodies described here,
allows discrimination of a wider variety of conformations than those that are
responsible for interactions of the target protein with other natural proteins. In
addition to probing ligand-induced conformational changes, the above-demonstrated
approach can detect effects of mutations, e.g., the deletion of the F domain.

In the present study, a yeast two-hybrid system was used as the means
to detect interactions of monobodies with a target in living cells. The yeast two-hybrid
system detects interactions in the nucleus. This is ideally suited for the investigation
of conformational changes of nuclear receptors that function in the nucleus. Clearly,
this work can be extended using the mammalian two-hybrid method. However,
alternative methods may be better suited for probing conformational changes of
proteins that are naturally located outside the nucleus. Potential methods include the
split-ubiquitin system (Johnsson & Varshavsky, 1994) and dihydrofolate reductase
reconstitution (Pelletier et al., 1998). Indeed, Raquet et al. reported the use of the
split-ubiquitin system to detect conformational differences of a protein in living cells
(Raquet et al., 2001). The present invention, using conformation-specific
monobodies, could readily be adapted to these systems.

The conformational changes of ERα-E and ERα-EF as discriminated by the
above-identified monobody collection generally agree with the conformational
differences of ERα- and ERβ-E domains found in a series of crystal structures. Thus,
the above results support the view that these crystal structures represent relevant
conformations of ER in cells. However, a dramatic increase in the interactions of the
monobody E3#6 and ERα was identified upon the deletion of the F domain
(Figures 16A-B). A similar effect was observed between SRC-1 and ERα. These results
may be interpreted in terms of a dynamic conformational equilibrium, in which ERα-E,
in particular helix 12 (Figures 13A-B), is in equilibrium among multiple conformations
and the presence of the F domain shifts this equilibrium away
from the "active" conformation. A number of mutations at residues 536 and 537,
which are located in the loop connecting helices 11 and 12, resulted in a constitutively
active phenotype (Weis et al., 1996; White et al., 1997; Zhang et al., 1997; Eng et al.,
1997), suggesting that these mutations can shift the conformational equilibrium within
the LBD. A series of ERβ LBD crystal structures also suggest the dynamic nature of
helix 12. In the genistein complex (Shiau et al., 1998), helix 12 is in a position similar
to that found in the ERβ-antagonist structure, as opposed to the "agonist"
conformation that is expected from the partial agonist activity of genistein. In the
structure of ERβ bound to an antagonist, ICI 164,384, the electron density for the
entire helix 12 is missing, suggesting a conformational disorder (Pike et al., 2001).
Furthermore, an NMR study of the LBD of peroxisome proliferator-activated receptor
γ, another member of the nuclear receptor family, revealed that the apo-LBD,
particularly the ligand- and cofactor-binding regions, is in a dynamic conformational
ensemble (Johnson et al., 2000). Since the F domain of ERα is quite large (~45
residues) and it is directly linked to helix 12, it is plausible that the F domain can
affect the balance of the conformational ensemble of the E domain even if the F
domain is largely unstructured. It should be noted that the observed effect of the
F-domain deletion may be mediated through a change in association of ERα with other
macromolecules such as heat shock proteins. These results demonstrate that our
approach can reveal conformational dynamics of a target protein in living cells, and
thus it can provide useful information complementary to static information obtained
from X-ray crystal structures.

The above results (Figures 14-16) demonstrate that different agonists
induce somewhat different conformations of ERα-EF, and that a subset of
monobodies is capable of detecting such structural differences. It is interesting that
the clone E2#11, which gave the lowest β-galactosidase activity among those tested,
was most sensitive to the differences among these agonist complexes. These results
suggest that monobodies with weak binding affinity may be quite useful for detecting
subtle conformational differences, consistent with the presence of a dynamic
conformational ensemble. They also suggest that the energetic barrier among the ERα
conformations induced by these agonists may be quite low, so that monobodies and
coactivators that bind tightly to ERα may be able to promote the "induced fit" of the
ERα conformation. Paige et al. have shown that these agonists induce distinct
conformations in full-length ERα and ERβ that are detectable using in vitro binding
assays of ER-binding peptides (Paige et al., 1999).

The above results also demonstrate that monobodies can be used as
modulators of biological functions. Although the inhibitory activity of the
first-generation monobody was modest, the binding affinity and specificity of monobodies
could be improved by introducing additional mutations in adjacent loops (see Figures
1A-B) and performing further rounds of selection with a higher degree of stringency.
Prior studies have demonstrated that the monobody scaffold can accommodate many
mutations in multiple loops (Koide et al., 1998). Peptide aptamers based on a single
loop and antibody fragments ("intrabodies") have been shown to be effective
inhibitors of intracellular processes (Colas et al., 1996; Richardson & Marasco, 1995).
Therefore, monobodies with potent inhibitory activity can also be developed.